Methodology

How we turn raw records into usable property intelligence.

Turkish real estate data is fragmented across TKGM, municipal zoning authorities, TÜİK, BDDK, MTA and EFEHR. We normalize, reconcile and version it through a six-stage pipeline with explicit coverage and missingness at every step.

The pipeline

Six stages, zero shortcuts.

Raw Ingestion: Raw Source

Raw source records, files, geometry and period data.

Bronze Evidence: Evidence Layer

Immutable evidence, manifest and source trace.

Silver Normalization

Type, geometry, identity and date standards.

Gold Aggregation: Decision Outputs

Decision-ready domain outputs.

Intelligence Profile

Property, parcel, district and market context.

Dossier / API Serving

Report, screening result or JSON payload.

Stage detail

What happens at each layer.

Raw Ingestion

Source documents are ingested without modification. TKGM cadastral records, municipal zoning PDFs, statistical releases, geological rasters, and project manifests are all stored in their original form with an ingestion timestamp and source identifier.
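As an illustration only, an ingestion envelope along these lines would keep the original bytes untouched while recording when and from where they arrived. The names `RawRecord`, `ingest_raw` and the `"tkgm/cadastre"` identifier are hypothetical, and the content hash is an added assumption that makes later tampering detectable; none of this is taken from the production pipeline.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class RawRecord:
    source_id: str         # hypothetical identifier, e.g. "tkgm/cadastre"
    ingested_at: datetime  # ingestion timestamp (UTC)
    sha256: str            # digest of the untouched payload (assumed extra)
    payload: bytes         # original bytes, stored as received


def ingest_raw(payload: bytes, source_id: str,
               now: Optional[datetime] = None) -> RawRecord:
    """Wrap a source document in an envelope without altering its bytes."""
    ingested_at = now or datetime.now(timezone.utc)
    digest = hashlib.sha256(payload).hexdigest()
    return RawRecord(source_id, ingested_at, digest, payload)


record = ingest_raw("Ada 120, Parsel 7".encode("utf-8"), "tkgm/cadastre")
```

Because the dataclass is frozen, any later stage that wants a cleaned-up version has to create a new object instead of editing the raw one.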

Bronze Evidence

Each raw record is parsed into structured evidence fragments. Field types are validated, encoding issues are fixed, and each fragment is tagged with its source family, record type, and ingestion batch.
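A rough sketch of this step, under several assumptions: the raw text uses a simple `key=value;key=value` layout, legacy files may arrive in the Windows Turkish code page (`cp1254`), and the field names in `EXPECTED_TYPES` are invented for the example. `EvidenceFragment` and `to_fragments` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class EvidenceFragment:
    field: str
    value: Any
    source_family: str
    record_type: str
    batch_id: str


def decode_text(raw: bytes) -> str:
    """Try UTF-8 first, then fall back to the Windows Turkish code page."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("cp1254")


# Hypothetical type expectations for one record type.
EXPECTED_TYPES = {"ada": int, "parsel": int, "mahalle": str}


def to_fragments(raw: bytes, source_family: str, record_type: str,
                 batch_id: str) -> list[EvidenceFragment]:
    """Parse 'key=value;...' text into typed, tagged evidence fragments."""
    fragments = []
    for pair in decode_text(raw).split(";"):
        key, _, text = pair.partition("=")
        key, text = key.strip(), text.strip()
        caster = EXPECTED_TYPES.get(key)
        if caster is None:
            continue  # fields outside the schema are skipped in this sketch
        # caster raises ValueError when the text does not match the type
        value = caster(text)
        fragments.append(
            EvidenceFragment(key, value, source_family, record_type, batch_id)
        )
    return fragments
```

Type validation here is simply the cast itself: a non-numeric `ada` value raises `ValueError` instead of slipping through as a string.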

Silver Normalization

Evidence fragments are normalized to canonical field definitions. Address text is parsed and geocoded. Parcel IDs are resolved to canonical property identifiers. Zoning codes are mapped to a unified taxonomy. Units and date formats are standardized.
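Two of these normalizations, date standardization and zoning mapping, might look roughly like the following. The taxonomy entries, the accepted date formats and the function names are assumptions made for the example. The `tr_lower` helper exists because Python's default `str.lower()` does not follow Turkish rules for dotted and dotless I.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical mapping from municipal zoning labels to a unified taxonomy.
ZONING_TAXONOMY = {
    "konut": "residential",
    "ticaret": "commercial",
    "sanayi": "industrial",
}

# Assumed input formats; the real set depends on the sources.
DATE_FORMATS = ("%d.%m.%Y", "%d/%m/%Y", "%Y-%m-%d")


def tr_lower(text: str) -> str:
    """Lowercase using Turkish rules: 'İ' becomes 'i' and 'I' becomes 'ı'."""
    return text.replace("İ", "i").replace("I", "ı").lower()


def normalize_date(text: str) -> date:
    """Parse a date in any accepted format into a date object."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def normalize_zoning(code: str) -> Optional[str]:
    """Map a source zoning label to the unified taxonomy, or None if unmapped."""
    return ZONING_TAXONOMY.get(tr_lower(code.strip()))
```

Returning `None` for an unmapped code, instead of guessing a category, leaves it to a later stage to flag the gap explicitly.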

Gold Aggregation

Normalized fields from multiple source families are merged per canonical property. Conflicts are resolved by source priority and recency. Missing fields are explicitly marked: inferred when cross-source evidence permits, unknown otherwise.
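One way to picture the conflict rule is a sort key of (source priority, recency). In the sketch below the priority table, the tie-breaking order (priority first, then the most recent record) and the `"observed"` state label are all assumptions; only the `inferred` and `unknown` states come from the text above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Optional

# Hypothetical priority order: a lower number wins.
SOURCE_PRIORITY = {"tkgm": 0, "municipality": 1, "listing": 2}


@dataclass(frozen=True)
class Candidate:
    value: Any
    source_family: str
    as_of: date


def resolve(candidates: list[Candidate],
            inferred: Optional[Any] = None) -> dict:
    """Pick one value for a field by source priority, then recency."""
    if candidates:
        best = min(
            candidates,
            key=lambda c: (
                # unlisted sources rank below every listed one
                SOURCE_PRIORITY.get(c.source_family, len(SOURCE_PRIORITY)),
                -c.as_of.toordinal(),  # newer records sort first
            ),
        )
        return {"value": best.value, "state": "observed",
                "source": best.source_family}
    if inferred is not None:
        return {"value": inferred, "state": "inferred", "source": None}
    return {"value": None, "state": "unknown", "source": None}
```

With two TKGM records and one municipal record for the same field, the newer TKGM record is selected even if the municipal record is more recent, because priority is compared before recency.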

Intelligence Profile

The gold aggregate is assembled into a property intelligence profile: canonical identity, zoning and building context, risk signals, market activity, macro context, and a full provenance trace for every field.

Dossier / API Serving

Intelligence profiles are served as structured API payloads, human-readable reports, or batch exports. Every output carries the same Dossier Evidence Contract: source, freshness, confidence and coverage state per field.
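To make the per-field contract concrete, here is a hedged sketch of how one field envelope could be serialized. Only the four attributes (source, freshness, confidence, coverage state) come from the text; the class name, key names, value formats and the sample property identifier are illustrative assumptions, not the published schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class FieldEvidence:
    value: Any
    source: Optional[str]        # e.g. source family plus ingestion batch
    freshness: Optional[str]     # assumed: ISO date of the underlying record
    confidence: Optional[float]  # assumed: score between 0 and 1
    coverage: str                # assumed labels, e.g. "covered", "unknown"


def to_payload(property_id: str, fields: dict[str, FieldEvidence]) -> str:
    """Serialize a profile so every field carries its evidence envelope."""
    body = {
        "property_id": property_id,
        "fields": {name: asdict(ev) for name, ev in fields.items()},
    }
    return json.dumps(body, ensure_ascii=False, sort_keys=True, indent=2)


payload = to_payload(
    "prop-0001",  # hypothetical identifier
    {
        "zoning": FieldEvidence("residential", "municipality/batch-001",
                                "2024-03-05", 0.9, "covered"),
        "flood_risk": FieldEvidence(None, None, None, None, "unsupported"),
    },
)
```

Note that the unsupported field still appears in the payload with its state spelled out, so a consumer can distinguish "no data" from "not covered".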

Core principles

What we do not compromise on.

Explicit missingness

Every field that cannot be populated from direct source evidence carries an explicit state: unknown, inferred, or unsupported. We never interpolate to hide gaps.

Source provenance

Every field in every output traces back to a specific source, ingestion batch and timestamp. Nothing is produced without a traceable origin.

Determinism

The same property at the same pipeline version and the same data state always produces the same output. Our pipeline is not probabilistic at the record level.

Scoped coverage

We only claim coverage we can back with real data. Geographies and field types not yet supported are listed as unsupported, not silently omitted.
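The determinism principle above lends itself to a mechanical check: serialize an output canonically and compare fingerprints across runs. The sketch below is one possible approach, not a description of the production system; `output_fingerprint` and the version strings are hypothetical.

```python
import hashlib
import json


def output_fingerprint(profile: dict, pipeline_version: str) -> str:
    """Hash a profile in canonical form so equal outputs give equal digests."""
    canonical = json.dumps(
        {"pipeline_version": pipeline_version, "profile": profile},
        sort_keys=True,          # key order must not affect the digest
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


run_a = {"zoning": "residential", "area_m2": 450}
run_b = {"area_m2": 450, "zoning": "residential"}  # same data, other key order

same = output_fingerprint(run_a, "v1") == output_fingerprint(run_b, "v1")
changed = output_fingerprint(run_a, "v1") != output_fingerprint(run_a, "v2")
```

Including the pipeline version in the hashed document means a version bump is expected to change the fingerprint, while a rerun on unchanged data is not.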
