How Intelligent Page Routing Makes AI Takeoff Accurate: The Classification Layer That Changes Everything
If you could improve only one thing about an AI construction takeoff system, it should not be the extraction model. It should be the page classifier.
This is counterintuitive. Most attention in construction AI goes to the extraction step — how well does the model identify symbols, count devices, and read specifications? But in production, the classification layer — deciding what type of drawing each page is and which extraction approach to use — has a larger impact on final accuracy than the extraction models themselves.
The reason is simple: if you use the wrong extraction approach on a page, even a perfect model will produce wrong results. An electrical extractor running on a mechanical plan will confidently report electrical devices that do not exist. A floor plan extractor running on a panel schedule will miss all the tabular data. A fire alarm extractor running on a lighting plan will misidentify every luminaire as a smoke detector.
This post explains how intelligent page routing works, why it matters so much, and what makes a routing system reliable at scale.
What Page Routing Actually Means
Page routing is the process of examining each page in a construction PDF and determining:
- What discipline(s) are present — electrical, mechanical, plumbing, fire protection, architectural, structural, civil, etc.
- What drawing type it is — floor plan, schedule, schematic, detail, riser diagram, section, etc.
- Which extraction approach(es) should handle it — symbol counting, table parsing, schematic analysis, measurement, etc.
This sounds simple. It is not.
A 200-page bid set might contain:
- 40 architectural plans (floor plans, reflected ceiling plans, sections, details, schedules)
- 35 electrical sheets (power plans, lighting plans, panel schedules, SLDs, fire alarm, low voltage)
- 30 mechanical sheets (duct plans, piping plans, equipment schedules, details)
- 20 plumbing sheets (fixture plans, riser diagrams, details)
- 15 structural sheets (framing plans, details, schedules)
- 10 civil sheets (site plans, grading, utility plans)
- 10 specification/general sheets (cover, index, legends, general notes, abbreviations)
- Various combined or specialty sheets
Each of these requires different handling. And many real-world drawings do not fall neatly into a single category.
The Three Layers of Page Routing
Effective page routing uses multiple layers, each catching cases the others miss.
Layer 1: Title Block and Sheet Name Analysis
The cheapest and most reliable signal is the drawing title itself. Construction drawings follow naming conventions that encode discipline and type information:
- "E-101 FIRST FLOOR POWER PLAN" — electrical (E), floor plan, power
- "M-201 SECOND FLOOR DUCT PLAN" — mechanical (M), floor plan, ductwork
- "P-001 PLUMBING RISER DIAGRAM" — plumbing (P), riser diagram
- "PANEL SCHEDULE 'LP-1A'" — electrical, panel schedule
Title analysis uses pattern matching to extract this structured information. It is fast (no AI computation needed) and reliable when titles follow conventions, and it provides a strong baseline classification.
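A minimal sketch of this kind of pattern matching, assuming a hypothetical prefix map and naming regex (not an actual production rule set):

```python
import re

# Hypothetical discipline prefixes; real drawing sets vary.
DISCIPLINE_PREFIXES = {
    "A": "architectural", "S": "structural", "M": "mechanical",
    "E": "electrical", "P": "plumbing", "FP": "fire_protection",
    "C": "civil",
}

# Hypothetical title keywords mapped to drawing types; order matters, so
# specific terms ("riser", "diagram") are checked before the generic "plan".
TYPE_KEYWORDS = {
    "schedule": "schedule", "riser": "riser", "detail": "detail",
    "diagram": "schematic", "plan": "floor_plan", "section": "section",
}

# Matches sheet numbers like "E-101", "M.201", or "P001".
SHEET_RE = re.compile(r"^([A-Z]{1,2})[-.]?\d{2,4}\b", re.IGNORECASE)

def classify_title(title: str) -> dict:
    """Derive discipline and drawing type from a sheet title, when possible."""
    discipline = None
    m = SHEET_RE.match(title.strip())
    if m:
        discipline = DISCIPLINE_PREFIXES.get(m.group(1).upper())
    lowered = title.lower()
    drawing_type = next(
        (t for kw, t in TYPE_KEYWORDS.items() if kw in lowered), None)
    return {"discipline": discipline, "type": drawing_type}
```

On the examples above, "E-101 FIRST FLOOR POWER PLAN" yields electrical/floor_plan, while a title with no sheet number, like "PANEL SCHEDULE 'LP-1A'", yields no discipline at all, which is exactly where the later layers take over.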
Where title analysis works well:
- Well-organized drawing sets from established firms
- Government projects with standardized naming (CPWD, state PWD, federal agencies)
- Sets that follow AIA or U.S. National CAD Standard (NCS) sheet naming conventions
Where title analysis fails:
- Scanned drawings where OCR misreads the title
- Non-standard naming conventions (small firms, residential projects)
- Combined or multi-trade sheets where the title only mentions one discipline
- Pages with no title block (details, diagrams extracted from larger sheets)
Layer 2: Text Content Analysis
Beyond the title, the text content of a page provides additional classification signals. Even before applying AI vision, the system can analyze extracted text to identify:
- Trade-specific terminology — "conduit," "branch circuit," "luminaire type" indicate electrical; "ductwork," "CFM," "supply air" indicate mechanical; "waste line," "fixture unit," "vent stack" indicate plumbing
- Schedule indicators — repetitive tabular text patterns suggest a schedule rather than a plan
- Specification references — IS code numbers, ASTM references, or CSI section numbers indicate discipline
- Equipment tags — tag numbering patterns (E-xx for electrical, AHU-xx for mechanical, P-xx for plumbing) identify trade presence
Text analysis fills the gap when title analysis is ambiguous. A page titled "DETAIL 7" does not tell you what discipline it belongs to, but if the text contains "duct transitions" and "CFM ratings," the discipline is clearly mechanical.
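As a toy sketch, this layer can be a keyword-scoring pass over the page's extracted text. The vocabularies and hit threshold below are illustrative assumptions; a production lexicon would be far larger and weighted:

```python
# Illustrative trade vocabularies (assumption: not a real production lexicon).
TRADE_TERMS = {
    "electrical": {"conduit", "branch circuit", "luminaire", "receptacle"},
    "mechanical": {"ductwork", "cfm", "supply air", "ahu"},
    "plumbing": {"waste line", "fixture unit", "vent stack"},
}

def score_disciplines(page_text: str) -> dict:
    """Count trade-specific term occurrences in a page's extracted text."""
    text = page_text.lower()
    return {
        trade: sum(text.count(term) for term in terms)
        for trade, terms in TRADE_TERMS.items()
    }

def likely_disciplines(page_text: str, min_hits: int = 2) -> list:
    """Trades with enough term hits to be considered present on the page."""
    return [trade for trade, hits in score_disciplines(page_text).items()
            if hits >= min_hits]
```

A page titled only "DETAIL 7" but whose text mentions CFM ratings and an AHU tag scores as mechanical, matching the example above.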
Layer 3: AI Vision Classification
When deterministic signals (title + text) are insufficient or ambiguous, AI vision examines the visual content of the page. A vision model analyzes the drawing's visual characteristics:
- Symbol density and type — is this page covered in device symbols (floor plan) or connected boxes (schematic)?
- Layout structure — is this a spatial layout (plan view) or a tabular structure (schedule)?
- Line patterns — are lines representing walls (architectural), ductwork (mechanical), piping (plumbing), or wiring (electrical)?
- Annotation density — heavily annotated detail drawings look different from clean plan views
Vision classification is the most powerful layer but also the most expensive (each page it examines requires a full model inference). The system invokes it selectively, only for pages where the deterministic layers could not reach a confident classification.
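Putting the three layers together, the selective invocation can be sketched as a simple gate: run the cheap layers first, and call the vision model only when they disagree or lack confidence. The threshold, data shapes, and layer interfaces here are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class PageLabel:
    discipline: Optional[str]
    confidence: float

def classify_page(
    page,
    title_layer: Callable,   # Layer 1: cheap title-block pattern matching
    text_layer: Callable,    # Layer 2: cheap keyword scoring
    vision_layer: Callable,  # Layer 3: expensive AI vision inference
    threshold: float = 0.8,
) -> PageLabel:
    """Run the cheap layers first; pay for vision inference only when
    they disagree or neither reaches the confidence threshold."""
    a, b = title_layer(page), text_layer(page)
    if a.discipline is not None and a.discipline == b.discipline:
        combined = max(a.confidence, b.confidence)
        if combined >= threshold:
            return PageLabel(a.discipline, combined)
    return vision_layer(page)  # selective: most pages never reach this call
```

When the title and text layers agree with high confidence, the vision model is never called; only ambiguous pages fall through to it.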
Why Multi-Label Classification Matters
Here is a fact about construction drawings that single-label classifiers miss: many pages contain multiple disciplines.
Examples that are common in real bid sets:
- Combined electrical plans — a single sheet showing both power and fire alarm devices (common in smaller projects to reduce sheet count)
- Architectural/electrical overlays — the electrical plan drawn over the architectural floor plan background, with both architectural and electrical information present
- Multi-trade detail sheets — a single sheet with details from mechanical, electrical, and plumbing (common for MEP coordination drawings)
- Mixed schedule sheets — a page with both a panel schedule and a lighting fixture schedule
A classifier that assigns only one label will miss content. If a combined power/fire alarm sheet is classified as "electrical power plan" only, the fire alarm extractor never runs on it, and all fire alarm devices on that sheet are missed from the takeoff.
Multi-label classification identifies ALL trades present on a page and invokes the appropriate extractor for each. This is more expensive (more extractors run per page) but far more accurate for the real-world drawing sets that contractors work with.
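The difference is easy to see in code. Assuming a classifier that emits an independent score per trade (the scores and label names below are made up for illustration), argmax keeps only the top trade, while thresholding keeps every trade that is present:

```python
# Hypothetical per-trade scores for a combined power/fire-alarm sheet,
# e.g. independent sigmoid outputs from a classifier (made-up numbers).
scores = {"electrical_power": 0.93, "fire_alarm": 0.71, "mechanical": 0.04}

def single_label(scores: dict) -> str:
    """Argmax: keeps one trade and silently drops the rest."""
    return max(scores, key=scores.get)

def multi_label(scores: dict, threshold: float = 0.5) -> list:
    """Independent thresholds: every trade above the cutoff gets routed."""
    return sorted(trade for trade, s in scores.items() if s >= threshold)

# single_label(scores) -> "electrical_power" (fire alarm extractor never runs)
# multi_label(scores)  -> ["electrical_power", "fire_alarm"]
```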
The Routing Decision: Which Extractors to Invoke
Classification determines trade presence. The routing step then maps trade presence to specific extraction modules. This mapping is not always one-to-one:
A page classified as "electrical" might be routed to different extractors depending on its drawing type:
- Electrical + floor plan → symbol counting extractor (receptacles, luminaires, switches)
- Electrical + schedule → table parsing extractor (panel schedules, equipment schedules)
- Electrical + schematic → topology extractor (single-line diagrams, control schematics)
- Electrical + riser → vertical routing extractor (power risers, feeder diagrams)
The same trade label leads to different extraction approaches depending on the drawing format. This is why classification needs both discipline AND type information — knowing it is "electrical" is not enough; knowing it is an "electrical panel schedule" determines the extraction strategy.
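One way to picture this step is a lookup table keyed on (discipline, type) pairs. The table below is a hypothetical sketch, not an actual module list:

```python
# Hypothetical routing table; a real system covers many more combinations.
EXTRACTOR_ROUTES = {
    ("electrical", "floor_plan"): "symbol_counting",
    ("electrical", "schedule"):   "table_parsing",
    ("electrical", "schematic"):  "topology",
    ("electrical", "riser"):      "vertical_routing",
    ("mechanical", "floor_plan"): "symbol_counting",
    ("mechanical", "schedule"):   "table_parsing",
}

def route(labels: list) -> list:
    """Map each (discipline, type) label on a page to an extractor.
    Multi-label pages fan out to several extractors."""
    return sorted({EXTRACTOR_ROUTES[l] for l in labels if l in EXTRACTOR_ROUTES})

# A combined sheet carrying an electrical plan plus a panel schedule:
# route([("electrical", "floor_plan"), ("electrical", "schedule")])
# -> ["symbol_counting", "table_parsing"]
```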
What Happens When Routing Goes Wrong
Routing errors have cascading effects:
False Positive (Wrong Extractor Invoked)
- Mechanical extractor runs on an architectural plan
- Reports "ductwork" for every thick line (walls, actually)
- Takeoff contains phantom mechanical items that inflate quantities
False Negative (Correct Extractor Not Invoked)
- Fire alarm extractor does not run on a combined plan
- All fire alarm devices on that sheet are missed
- Takeoff underreports fire alarm quantities, potentially by entire floors
Type Mismatch (Right Discipline, Wrong Format)
- Panel schedule is classified as a floor plan
- Symbol counting extractor runs instead of table parser
- Extracts nothing useful because there are no symbols to count
Each of these errors is invisible to the downstream extraction — the extractor does its job correctly given what it was told to do. The error is in what it was told to do.
How Routing Accuracy Is Measured
Routing accuracy is not a single number. It breaks down into:
| Metric | What It Measures |
|---|---|
| Discipline recall | What fraction of actual trade presence is detected? (Missing a trade = false negative) |
| Discipline precision | What fraction of detected trades are actually present? (Phantom trade = false positive) |
| Type accuracy | Given correct discipline, is the drawing type (plan/schedule/schematic) correctly identified? |
| Routing accuracy | Given correct classification, is the right extractor selected? |
In production, discipline recall is the highest-priority metric — it is worse to miss a trade entirely (causing underreported quantities) than to invoke an extra extractor that finds nothing (wasted compute but no incorrect data).
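For a single page, the first two metrics reduce to set arithmetic over predicted and actual trade labels (the label names here are illustrative):

```python
def discipline_metrics(predicted: set, actual: set) -> tuple:
    """Per-page recall and precision for multi-label trade detection."""
    true_positives = len(predicted & actual)
    recall = true_positives / len(actual) if actual else 1.0
    precision = true_positives / len(predicted) if predicted else 1.0
    return recall, precision

# The sheet truly carries power + fire alarm; the system detected only power:
# recall 0.5 (a trade missed, so its devices drop out of the takeoff),
# precision 1.0 (no phantom trades).
recall, precision = discipline_metrics(
    {"electrical_power"}, {"electrical_power", "fire_alarm"})
```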
Practical Impact on Takeoff Accuracy
We can quantify the impact of routing quality on final takeoff accuracy:
| Routing Accuracy | Approximate Takeoff Accuracy | Impact |
|---|---|---|
| 95%+ (good routing) | 85–95% | Minor manual corrections needed |
| 85–95% (adequate routing) | 70–85% | Moderate review effort; some items missed or miscategorized |
| Below 85% (poor routing) | Below 70% | Extensive manual correction; often faster to redo manually |
The relationship is not linear — a small improvement in routing accuracy produces a disproportionate improvement in takeoff accuracy because routing errors affect entire pages (dozens of items) rather than individual items.
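A back-of-the-envelope calculation makes the amplification concrete (the page and item counts below are assumed, not measured):

```python
# Why page-level errors dominate item-level errors (numbers assumed).
pages = 100
items_per_page = 40  # assumed average takeoff items per sheet
total_items = pages * items_per_page

# One item-level extraction error affects a single quantity.
item_error_share = 1 / total_items

# One page-level routing error corrupts or drops every item on that page.
page_error_share = items_per_page / total_items
```

Under these assumed numbers, a single routing error costs as much takeoff accuracy as forty extraction errors, which is why the mapping from routing accuracy to final accuracy is so steep.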
The Flywheel Effect
Intelligent page routing creates a positive feedback loop:
- Better classification → right extractors invoked → more accurate takeoff
- More accurate takeoff → less manual correction → faster review
- Faster review → more projects processed → more data on drawing types
- More data → better classification models → better classification
Each drawing set processed improves the system's ability to classify the next one, especially for edge cases and non-standard conventions.
What This Means for Evaluating AI Takeoff Tools
When evaluating AI takeoff software, the routing layer is the first thing to test:
- Upload a mixed drawing set — include multiple disciplines, different drawing types (plans, schedules, schematics, details), and some edge cases (combined sheets, non-standard naming)
- Check the classification — does the system correctly identify what each page is? Does it detect multiple trades on combined sheets?
- Look at the extraction scope — are ALL disciplines extracted, or only the ones the system classified correctly?
- Test with real drawings — demo drawings are always clean. Your actual project drawings are the real test.
A system with mediocre extraction models but excellent routing will outperform a system with superior extraction models but poor routing. The routing is the foundation everything else builds on.
Aginera's page routing uses hybrid deterministic + AI classification with multi-label trade detection across 30+ drawing type categories. Upload your drawings and see how every page gets classified. Try it free.