Nextyn IQ
Expert Research Intelligence | Working Paper

Multilingual Expert Calls: Accuracy, Nuance, and What Gets Lost in Translation

When expert calls cross language boundaries, something always gets lost. This working paper examines what that something is — and how to recover it.

Nextyn IQ Research

Abstract

This paper examines the structural accuracy and nuance challenges that arise when expert call research crosses language boundaries, with particular attention to APAC markets where primary research is often conducted in Mandarin, Japanese, Korean, Bahasa, and Vietnamese but consumed in English. The translation gap in primary research is not merely a logistical inconvenience — it is a systematic source of analytical error that most institutional research operations have not yet addressed with the rigour the problem demands. Three distinct layers of translation loss are identified and examined: literal accuracy, hedging language, and cultural context. For each layer, this paper describes the failure mode, its frequency in practice, and the infrastructure required to mitigate it. The paper concludes with a framework for building multilingual research operations capable of preserving analytical fidelity across language boundaries.

Methodology Note

This paper draws on analysis of 180+ multilingual expert call transcripts across six APAC markets. Where specific translation accuracy data is cited, it reflects blind back-translation testing conducted by professional linguists with sector-specific expertise.

The Scope of the Problem

The majority of global primary research is consumed in English. The majority of the world's most interesting investment opportunities are in markets where English is not the primary business language. This is not a new observation, but its practical implications for research quality remain underappreciated.

Consider the operational reality: an analyst in London or New York making investment decisions about a Vietnamese logistics company or a Korean semiconductor supplier is depending on translation infrastructure that most research operations have never systematically evaluated. The expert networks, research platforms, and independent analysts supplying that intelligence operate translation pipelines of widely varying quality — and those quality differences are rarely disclosed, rarely tested, and rarely incorporated into confidence assessments.

The scale of the problem is larger than most practitioners assume. In a typical APAC-focused research programme, between 40% and 70% of expert calls in key markets are conducted in a language other than English. The share varies by market (Japan and Korea sit at the high end, Singapore and Hong Kong at the low end), but across the region translated research makes up the majority of the primary intelligence base for many investment theses.

The problem compounds at the thesis level. A single material error in a translated transcript may be inconsequential in isolation. When that error appears in three of twelve calls informing a key assumption (margin structure, competitive positioning, regulatory risk), the downstream analytical effect becomes significant. The challenge is that conventional research operations have no reliable mechanism for detecting this.
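The compounding effect can be made concrete with a small calculation. The sketch below is illustrative only: it assumes each transcript carries a material error independently with the same probability, which real programmes will not satisfy exactly, and the error rates in the loop are hypothetical values, not figures from this paper. The function name prob_at_least is likewise an assumption introduced for the example.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p).

    n: number of calls informing an assumption
    k: number of affected calls we are asking about
    p: assumed per-transcript probability of a material error
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical per-transcript error rates, chosen only to show the shape
# of the curve; they are not measurements.
for p in (0.05, 0.10, 0.20):
    print(f"p={p:.2f}: P(at least 3 of 12 calls affected) = {prob_at_least(3, 12, p):.3f}")
```

The point of the exercise is directional: the chance that several calls behind one assumption are affected rises quickly as the per-transcript rate rises, which is why an undetected error rate matters more at the thesis level than at the level of a single call.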

Unique Signal | EXP-062 | 88/100
Former Research Analyst, Emerging Markets Fund

We ran 22 calls on a Southeast Asian fintech in local languages. When we had 5 of the transcripts independently back-translated, three had material errors in numerical claims — margin figures, market share estimates, headcount.

This is not an isolated finding. Back-translation testing — the practice of having a second linguist independently retranslate material from the translated version back to the source language — consistently surfaces material discrepancies in primary research transcripts that were not flagged during standard review. The question is not whether translation errors exist in multilingual research operations. The question is whether the research consumer has any means of knowing which claims to scrutinise.

Three Layers of Translation Loss

Translation loss in expert research does not occur uniformly. It operates across three distinct layers, each with different failure characteristics, different detection difficulty, and different implications for analytical reliability.

Layer 1: Literal Accuracy

The most obvious layer of translation loss is literal accuracy: specific figures (margins, market share estimates, growth rates, headcount) are mistranslated, rounded incorrectly, or rendered in different units than the expert intended. This is the layer most practitioners think of when they consider translation risk, and it is the layer that contemporary translation tooling handles best.

Modern machine translation systems are reasonably reliable for direct numerical claims in standard business contexts. When an expert states a specific percentage or figure in a structured interview context, that figure is usually preserved accurately in translation. The failure modes at this layer tend to be specific: unit confusion (percentage points versus percentage change), time period ambiguity (fiscal versus calendar year), and currency or market boundary qualifications that attach to a figure but are dropped in translation.
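The unit confusion named above is easy to underestimate, so a worked example may help. The margin figures below are hypothetical and the two function names are assumptions introduced for illustration; the arithmetic is simply the difference between a change in percentage points and a relative percentage change.

```python
def percentage_point_change(old_pct: float, new_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - old_pct


def relative_change_pct(old_pct: float, new_pct: float) -> float:
    """Change expressed as a percentage of the starting value."""
    return (new_pct - old_pct) / old_pct * 100


# Hypothetical example: an expert says margin moved from 20% to 25%.
old_margin, new_margin = 20.0, 25.0

pp = percentage_point_change(old_margin, new_margin)
rel = relative_change_pct(old_margin, new_margin)

print(f"Change in percentage points: {pp}")
print(f"Relative change in percent:  {rel}")
```

If a translation renders "up 5 points" as "up 5%", a reader working from a 20% base would infer a margin of 21% rather than 25%. The figure survives the translation; the unit does not.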

Layer 1 errors are also the most detectable. Back-translation testing catches them reliably. A quality-conscious research operation can implement numerical claim verification as a targeted QA step without reviewing entire transcripts, making this the most tractable of the three layers to address systematically.
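A targeted numerical check of this kind could look roughly like the following. This is a minimal sketch under stated assumptions, not a description of any particular QA tool: it compares the source-language original with the back-translation, both in the same language, and it only recognises figures written in Arabic digits. Figures written as words or in other numeral systems would need separate handling. The function names and the sample strings (English stand-ins for source-language text) are hypothetical.

```python
import re
from collections import Counter

# Digits with optional "." or "," groupings and an optional trailing "%".
# Separators are left untouched, because both texts are in the same
# language and are assumed to follow the same numeric convention.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*\s*%?")


def numeric_tokens(text: str) -> Counter:
    """Count each numeric token in the text, ignoring internal whitespace."""
    return Counter(re.sub(r"\s", "", m.group()) for m in NUMBER.finditer(text))


def numeric_discrepancies(source: str, back_translation: str) -> dict:
    """List figures present in one text but not in the other."""
    src, back = numeric_tokens(source), numeric_tokens(back_translation)
    return {
        "missing_from_back_translation": sorted((src - back).elements()),
        "unexpected_in_back_translation": sorted((back - src).elements()),
    }


# Hypothetical stand-ins for a source passage and its back-translation.
source = "Gross margin was about 18.5% in 2023 with 1,200 staff."
back = "Gross margin was about 15.8% in 2023 with 1,200 staff."

result = numeric_discrepancies(source, back)
print(result)
```

A check like this flags only that a figure changed. It says nothing about qualifications attached to the figure, such as a time period or regional scope, which still need a human reviewer.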

Layer 2: Hedging Language

The most dangerous layer of translation loss is hedging language: the register of uncertainty, qualification, and indirect signalling that surrounds factual claims. Every language has a distinct hedging vocabulary, and the mapping between hedging registers across languages is rarely one-to-one. A hedging term that signals mild caution in one language may have no equivalent in another, and a translator who renders it as the closest available English approximation will routinely understate or overstate the speaker's intended confidence level.

Japanese business communication illustrates this problem with particular clarity. Japanese has a highly developed system of indirect expression in professional contexts. A phrase that translates literally as "there may be some issues" is often a formal signal of a serious operational problem — the indirectness is culturally required, not an indication of uncertainty about the severity. A translator rendering this as "there may be some issues" has produced a technically accurate translation that conveys the opposite of the intended analytical signal.

Mandarin presents a parallel but different challenge. Certain indirect formulations in Mandarin business conversation indicate strong disagreement or rejection — the indirectness is a function of politeness norms, not genuine ambivalence. These formulations often translate to English as expressions of mild concern or qualified uncertainty, inverting the strength of the signal. A research consumer reading the translated transcript has no means of knowing that the expert's qualified language was in fact a firm negative assessment.

Layer 2 errors are dangerous precisely because they are not detectable through standard quality review. A back-translation test will rarely surface a hedging language error: a literal rendering survives the round trip intact, so the back-translated version looks as accurate as the original translation. Detecting hedging language errors requires a native-speaker reviewer with sector expertise who can evaluate the translated claim against the source language original, not just verify that the translation is internally consistent.

The number was translated correctly. The confidence behind the number was not. That's a harder problem.

Senior Research Associate, Pan-Asian Investment Fund

Layer 3: Cultural Context

The least visible layer of translation loss is cultural context: the frameworks of meaning that give an expert's statements their full analytical significance, and that have no direct encoding in the translated text. Cultural context loss is not a translation error in the conventional sense — the translated text is accurate. The problem is that accuracy is insufficient when the statement derives its meaning from a shared cultural framework that the English reader does not possess.

An Indonesian expert's assessment of a competitor "losing face" in a public dispute is a concrete example. The translated phrase is accurate. But "losing face" in the Indonesian business context carries specific implications for how a company's partners, customers, and prospective employees will respond — implications that are not encoded in the English translation and that an analyst unfamiliar with the cultural framework will not be able to reconstruct from the transcript alone. The claim has been accurately translated; its analytical significance has been lost.

Confucian business culture norms create a systematic cultural context problem across multiple APAC markets. An expert in Japan, Korea, or Taiwan speaking about a former employer's leadership will systematically soften negative assessments. This is not reluctance or incomplete disclosure in the conventional sense — it is a culturally required modulation of expression that is invisible in translation. The translated text will appear to contain a mildly qualified positive assessment; the source language expert may have been communicating serious concerns within the available register.

What Good Multilingual Research Infrastructure Looks Like

Addressing the three layers of translation loss requires purpose-built infrastructure, not incremental improvement to conventional research processes. Four requirements are necessary and sufficient for a multilingual research operation capable of preserving analytical fidelity.

1. Native-speaker moderators, not bilingual interpreters. The distinction matters operationally. A bilingual interpreter is a language professional. A native-speaker moderator with sector expertise is something different: a practitioner who can evaluate what an expert says in the source language against their own sector knowledge, identify gaps and inconsistencies in real time, and probe claims with contextually appropriate follow-up. The interpreter produces an accurate translation; the moderator produces an accurate intelligence signal. For research above a basic threshold of analytical ambition, interpreters are insufficient.

2. Real-time linguistic flagging. Moderators should flag hedging language and cultural context markers during the call, not post-transcription. This serves two purposes: it preserves the flag in the primary record (rather than relying on a post-hoc review that may not have access to audio), and it creates an opportunity for the moderator to probe the flagged claim directly with the expert. A real-time flag on a hedging phrase can be immediately followed by a clarifying question; a post-transcription flag cannot.

3. Dual-track transcripts. A verbatim transcript in the original language, retained as a permanent record, plus a structured English-language intelligence summary prepared by the native-speaker moderator. The intelligence summary is not a raw translation — it is a synthesis that incorporates the moderator's linguistic and sector expertise, explicitly surfaces hedging language and cultural context flags, and separates high-confidence factual claims from qualified or contextually complex assessments. Raw translation alone is analytically insufficient for any non-trivial research purpose.

4. Back-translation QA for high-conviction claims. Any claim from a non-English source that exceeds a confidence threshold (in practice, any claim that will bear material analytical weight) should be back-translated by a second linguist who has not seen the source-language original. This applies particularly to numerical claims with attached qualifications, assessments of competitive position, and statements about regulatory or structural risk. Back-translation is not a general solution to the hedging language and cultural context problems (it addresses Layer 1 and only partially addresses Layer 2), but it provides a meaningful quality floor for high-stakes claims.
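The four requirements above imply a particular shape for the call record. The sketch below shows one way that record might be modelled; it is an illustration under assumptions, not a specification. All class, field, and method names are hypothetical, the placeholder strings stand in for real transcript text, and "bears material weight" is treated as a simple analyst-set boolean rather than a computed threshold. It assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field
from enum import Enum


class Flag(Enum):
    """Linguistic flags a moderator can attach during the call."""
    HEDGING = "hedging"
    CULTURAL_CONTEXT = "cultural_context"


@dataclass
class Claim:
    source_text: str              # verbatim wording in the original language
    english_summary: str          # moderator's synthesis, not a raw translation
    bears_material_weight: bool   # set by the analyst; drives back-translation QA
    is_numerical: bool = False
    flags: set[Flag] = field(default_factory=set)


@dataclass
class DualTrackTranscript:
    call_id: str
    source_language: str
    claims: list[Claim] = field(default_factory=list)

    def back_translation_queue(self) -> list[Claim]:
        """Claims that should go to a second, blind linguist."""
        return [c for c in self.claims if c.bears_material_weight]

    def flagged_for_native_review(self) -> list[Claim]:
        """Claims carrying hedging or cultural-context flags."""
        return [c for c in self.claims if c.flags]


# Hypothetical record with placeholder text.
transcript = DualTrackTranscript(
    call_id="call-001",
    source_language="ja",
    claims=[
        Claim("<source text A>", "Margin figure with regional qualifier",
              bears_material_weight=True, is_numerical=True),
        Claim("<source text B>", "Indirect remark about operational issues",
              bears_material_weight=True, flags={Flag.HEDGING}),
        Claim("<source text C>", "Background on team structure",
              bears_material_weight=False),
    ],
)

print(len(transcript.back_translation_queue()))
print(len(transcript.flagged_for_native_review()))
```

Keeping the source text and the moderator's summary on the same claim record is the design choice that matters here: it lets a later reviewer check a flagged claim against the original wording instead of against the translation alone.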

Unique Signal | EXP-074 | 82/100
Former Operations Lead, Regional Consumer Platform

When I was being interviewed about our growth metrics, I gave a figure in Thai that included a specific regional qualification. The English transcript just showed the number. The qualification would have changed the investment interpretation entirely.

These four requirements are not aspirational. They are operationally achievable for any research operation with a clear-eyed view of what multilingual primary research is actually for. The cost is real: native-speaker moderators with sector expertise command significant premiums over bilingual interpreters, and dual-track transcripts require more production time than raw translations. The cost is also front-loaded: it is incurred in building the infrastructure, not distributed invisibly across every call. Research operations that have made this investment consistently report that the analytical improvement in translated research is sufficient to justify it on research quality grounds alone, without needing to invoke error-avoidance rationales.

APAC-Specific Considerations

The general framework above applies across APAC markets, but each major research language has specific characteristics that warrant explicit consideration.

Mandarin. A tonal language with sector-specific jargon that evolves rapidly, particularly in technology and fintech. Analysts should be aware that Mainland Mandarin, Taiwanese Mandarin, and Singapore Mandarin have diverged in meaningful ways at the sector vocabulary level — a moderator calibrated for one market may not be fully calibrated for another. The Mandarin hedging register is particularly rich and consequential for research interpretation; moderators should be explicitly briefed on the importance of flagging indirect expression.

Japanese. Keigo — the formal honorific register of Japanese — can obscure direct assessment in ways that are structurally invisible to an English reader. Japanese experts with professional norms around discretion will frequently omit information about former employers rather than state it negatively. Research consumers should be trained to read omissions in Japanese expert transcripts with the same analytical attention as explicit claims. What is not said in a Japanese expert call is often as informative as what is said.

Bahasa Indonesia and Bahasa Malaysia. Similar languages with meaningful distinctions at the business vocabulary level. Indonesian business language has absorbed a different set of English loanwords and technical terms than Malaysian, and the two have diverged further in digital economy and fintech vocabulary over the past decade. A moderator who is a native speaker of one should not be assumed to be equally calibrated for the other. The distinction matters for research in industries where the two markets have developed at different rates.

Korean. Korean business communication is deeply hierarchy-embedded. Senior executives discussing junior colleagues' performance, or experts commenting on the conduct of organisations where hierarchy differentials were present, will modulate their language in ways that reflect those differentials. A senior expert's apparently measured assessment of a junior colleague's decision may be a strong negative signal within the Korean hierarchy register. Moderators should flag hierarchy-referencing language explicitly.

Vietnamese. A newer and rapidly growing market for expert network research. The pool of qualified bilingual sector experts with both deep market knowledge and the communication fluency required for productive research calls is more limited than in more established APAC research markets. This creates a specific risk: a higher proportion of Vietnamese expert calls will involve participants who are communicating in English at the edge of their professional vocabulary, creating translation uncertainty that runs in the opposite direction — the expert's English expression may not fully capture what they intend to say. Vietnamese research programmes require extra QA budget and moderator attention.

Closing

Multilingual expert research is not harder than English-language research — it is differently hard. The skills required are different, the quality assurance infrastructure required is different, and the analytical literacy required to consume the output is different. Research operations that treat multilingual calls as straightforward translations of the standard English-language call process will systematically underperform the analytical potential of their source material.

The teams that build the right infrastructure for multilingual research gain access to markets that their less-prepared competitors are effectively locked out of — not because those competitors cannot conduct calls in local languages, but because the intelligence they extract from those calls is systematically degraded. In highly competitive research environments, that degradation is not a minor inefficiency. It is an information disadvantage that compounds across every investment thesis that touches non-English-language markets.

The cost of getting multilingual research right is front-loaded: the investment in native-speaker moderators with sector expertise, dual-track transcript infrastructure, and back-translation QA protocols is concentrated at the programme-build stage. The cost of getting it wrong is concentrated at the investment thesis. For research operations that take analytical fidelity seriously, the allocation decision is straightforward.