March 2026
Below you can find an overview and the detailed methodology for QS Global Student Flows.
Contents
- About QS Global Student Flows
- Understanding the Sources Behind Global Student Flow Data
- Forecasting Global Student Flows
- Validation
- The Global Student Flows Model
- Why Cities Matter
About QS Global Student Flows
The Global Student Flows (GSF) dataset was created to address a major gap in international education data: the lack of a single, authoritative, standardized source of information on global student mobility. While international student movement represents a significant and growing global phenomenon, the available data is fragmented, inconsistent, and often difficult to compare across countries.
This dataset aims to build a comprehensive system that not only consolidates global data but also standardizes and forecasts international student flows, offering stakeholders clearer and more actionable insights.
The Core Challenge: Inconsistent and Non-Comparable Data
International student data is collected and published by a range of sources, including national ministries of education, immigration departments, private organizations, and international bodies such as UNESCO and OECD. However, these sources differ in how they define and measure “international students.”
Some countries classify students by citizenship, others by residency status or visa category. Reporting methods, timelines, and coverage vary significantly. In many cases, reporting to international platforms is voluntary, which can lead to incomplete submissions and data gaps across years or countries.
As a result, direct comparisons between destinations are often unreliable. The lack of consistency and completeness limits the ability of institutions, governments, and industry stakeholders to make informed decisions.
Creating a Standardized and Forward-Looking Framework
The Global Student Flows initiative is designed as more than a static dataset. It functions as an integrated framework and forecasting engine. At its foundation is a standardized global dataset constructed by harmonizing diverse national data sources.
This dataset feeds into a structured model built using sound methodological principles. The system combines quantitative data processing with expert input to produce validated outputs and forecasts. The emphasis on forecasting allows stakeholders to anticipate future mobility trends rather than relying solely on historical figures.
By integrating technical modelling with domain expertise, the framework seeks to deliver both accuracy and practical relevance.
Geographic and Structural Coverage
Our dataset captures:
- Student flows across ~200 countries and ~3,000 cities.
- Historical coverage from 2014 to the present.
- Forecast projections extending to 2030.
Market Significance and Stakeholder Value
The international student market is valued at approximately $500 billion, underscoring its economic importance. Reliable and comparable data has value across multiple stakeholder groups.
For governments and policymakers, standardized and forecasted data can support visa policy design, national education planning, and economic strategy. For universities and higher education institutions, insights into future student flows can inform recruitment strategies, budgeting decisions, and long-term planning.
By addressing current data gaps and inconsistencies, the initiative aims to provide a shared evidence base for more strategic decision-making across the international education sector.
Data Standardization and Estimation Challenges
One of the primary challenges in building a global system lies in standardization. Even countries with strong reporting mechanisms often use definitions or categories that must be adjusted to fit a consistent global framework. This requires extensive cleaning, harmonization, and methodological alignment.
An additional challenge arises from countries that do not report international student data at all. In these cases, estimation techniques are used. These methods draw on regional patterns, historical trends, and comparable country data to produce reasonable projections. Ensuring transparency and methodological rigor in these estimations is critical to maintaining credibility.
Organizational Structure and Capabilities
The initiative is supported by a multidisciplinary team organized into three core functions:
- Model Development: Data scientists responsible for building, maintaining, and improving the forecasting model for country-to-country level student flows.
- Data and Forecasting: Analysts who collect, forecast, and interpret global data and produce regional reports.
- Research and Development: A specialized function focused on developing detailed datasets at the award, city, and subject levels.
This structure enables the integration of data science, applied research, and granular geographic analysis within a single system.
Understanding the Sources Behind Global Student Flow Data
Global student flow data aims to measure how students move across borders for higher education. To build a reliable and globally comparable dataset, the model draws on a wide range of sources, applies consistent definitions, and carefully processes the information. Although many external datasets contribute to the process, the final published data reflects significant analysis, cleaning, and modelling, making it a unique, consolidated output.
GSF Data Sources
The foundation of the dataset comes from official national sources. These include government departments, education ministries, immigration authorities, and other official statistical agencies. When available, government data is prioritized because it is generally the most authoritative and detailed.
However, not all countries publish complete or consistent information. To address this, the model also incorporates international data repositories such as:
- UNESCO
- OECD
- Project Atlas
When official and international datasets are incomplete, additional information may be drawn from reputable articles or reports. All external data undergoes processing and alignment before being integrated into the model.
Defining “Student Flows”
A clear and consistent definition is essential for comparability. Student flows are defined as enrolment data in higher education, specifically:
- Undergraduate degree programs
- Postgraduate degree programs
- Language programs (where relevant, such as in countries with large language study sectors)
Two groups are excluded: holders of extended post-study work visas (for example, graduates on work permits), and students who completed their prior schooling in the host country but are classified as international students based on nationality. This ensures that the dataset focuses strictly on students actively enrolled in degree-seeking or approved short-term academic programs who travel for the purpose of higher education.
Confidence Scores and Data Weighting
To reflect the reliability of different sources, each data input is assigned a confidence score between 0 and 1.
These scores act as weights in the model. Higher-confidence data has greater influence on the final output, while lower-confidence data contributes proportionally less. When multiple sources exist for a country, weighted averages are calculated to determine the most robust estimate.
This structured weighting system increases transparency and strengthens methodological rigor.
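The weighting described above can be illustrated with a minimal sketch. The function name, the example figures, and the choice of a simple weighted mean are assumptions made for illustration; the exact calculation used in the model is not specified here.

```python
def weighted_estimate(observations):
    """Combine (value, confidence) pairs into one confidence-weighted average.

    Each confidence score is expected to lie between 0 and 1.
    """
    for _, confidence in observations:
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence scores must lie between 0 and 1")
    total_weight = sum(confidence for _, confidence in observations)
    if total_weight == 0:
        raise ValueError("at least one source needs a non-zero confidence score")
    weighted_sum = sum(value * confidence for value, confidence in observations)
    return weighted_sum / total_weight


# Hypothetical enrolment figures for one destination country and year.
sources = [
    (100_000, 0.9),  # illustrative official national figure
    (90_000, 0.6),   # illustrative international repository figure
    (120_000, 0.3),  # illustrative figure from a published report
]
estimate = weighted_estimate(sources)
print(round(estimate))
```

In this sketch a source with a confidence score of 0.9 pulls the estimate three times as strongly as a source scored 0.3.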
Update Frequency and Ongoing Improvements
The dataset is updated on a quarterly cycle, allowing structured and predictable revisions. In addition to regular updates, special releases are incorporated when major datasets become available. This combination of scheduled and event-driven updates ensures that the dataset remains current, responsive, and aligned with newly released information.
From Multiple Sources to a Unified Output
Although the model draws on a wide range of external data—from government ministries to international organizations—the final output reflects a carefully processed, standardized, and weighted dataset. Through clear definitions, systematic adjustments, and confidence-based weighting, the result is a globally comparable view of student mobility that goes beyond any single source.
By combining multiple inputs into a unified methodology, the model provides a consistent and structured perspective on global higher education flows.
Forecasting Global Student Flows
This section describes how global student mobility is forecast over a five-year horizon. A structured process is used to analyse international student flows between source countries (where students come from) and destination countries (where they study). Forecasting in this field is complex, but by combining high-quality data, structured frameworks, and scenario planning, we develop informed projections that support planning and decision-making.
Step 1: Building a Reliable Data Foundation
The forecasting process begins with collecting data from credible sources, including government agencies and international organisations. These sources provide statistics on international student numbers, migration policies, economic indicators, and demographic trends.
Once the data is gathered, analysts identify and fill gaps to create a complete and consistent dataset. Ensuring accuracy and completeness at this stage is critical, as the quality of the forecast depends heavily on the reliability of the underlying data.
Step 2: Applying a Structured Forecasting Framework
After preparing the dataset, the next step is forecasting. This involves analysing historical trends and identifying key drivers of student mobility. The framework used groups drivers into three main categories:
- Push factors – Conditions in a student’s home country that encourage them to study abroad (e.g., limited domestic education capacity or employment opportunities).
- Pull factors – Attractions offered by destination countries (e.g., favourable visa policies, strong employment prospects, or government recruitment targets).
- Disruption factors – External risks that can affect both source and destination countries (e.g., global crises).
Our analysts examine past fluctuations in student numbers and the reasons behind them to assess whether similar patterns may occur in the future.
Step 3: Validating the Forecast
Forecasts are not produced in isolation. Once initial projections are developed, they undergo internal review and validation. Experts assess assumptions, compare findings with real-world developments, and refine outputs before finalising the forecast. This process helps improve credibility and practical usefulness.
Understanding Forecast Accuracy
Forecasting is inherently uncertain—no one can predict the future with complete precision. However, long-term global trends in international student mobility have historically shown steady growth, averaging around 3–4% annually over past decades. At a high level, this makes overall global growth relatively predictable.
Uncertainty increases at more granular levels, such as specific destination countries or individual source-to-destination corridors. To manage this uncertainty, we use tools including confidence intervals, which provide upper and lower bounds around projections, and scenario analysis, which models different possible futures.
Scenario Planning: Preparing for Multiple Futures
Scenario analysis is a core tool used to manage uncertainty. Three main scenarios are typically considered:
1. Regulated Regionalism
A world where geopolitical fragmentation leads to strong intra-regional mobility and emerging destinations accelerate ahead.
2. Hybrid Multiversity
A world of blended, tech-enabled models that reshape where and how students learn, featuring a strong push towards transnational campuses.
3. Talent Race Rebound
A high-growth, globally competitive environment where nations aggressively seek international students as future citizens and workers.
Advanced Modelling: Simulating Possibilities
Advanced statistical techniques are applied to further refine the forecasts. In particular, Monte Carlo simulation is used to model a wide range of potential outcomes across source countries and policy or market scenarios.
For a given destination, the analysis does not rely solely on a single estimate of total inbound students. Instead, source-country-level inflows are modelled individually under each scenario, reflecting differences in economic conditions, policy settings, visa dynamics, and demand sensitivity. These country-specific projections are then aggregated across thousands of simulated iterations.
This approach generates a distribution of possible outcomes for a given study destination rather than a point forecast, enabling identification of the most probable trajectory as well as the associated uncertainty range.
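As a rough illustration of the simulation idea described above, the sketch below draws annual growth rates for each source country, compounds them, and sums the results across many iterations for a single scenario. The normal distribution for growth, the parameter values, and all function names are assumptions made for this example, not the model's actual specification.

```python
import random
import statistics


def simulate_destination(baselines, growth_params, years, n_iter=2000, seed=0):
    """Simulate total inbound students for one destination under one scenario.

    baselines: {source_country: current inbound students}
    growth_params: {source_country: (mean_annual_growth, std_dev)}
    Returns the simulated totals after `years` years, sorted ascending.
    """
    rng = random.Random(seed)
    totals = []
    for _iteration in range(n_iter):
        total = 0.0
        for country, base in baselines.items():
            mean, std_dev = growth_params[country]
            level = float(base)
            for _year in range(years):
                # Draw one annual growth rate; floor at -100% so levels stay non-negative.
                level *= 1.0 + max(rng.gauss(mean, std_dev), -1.0)
            total += level
        totals.append(total)
    return sorted(totals)


def summarise(sorted_totals, lower=0.05, upper=0.95):
    """Return the median and an empirical lower/upper range from sorted totals."""
    n = len(sorted_totals)
    return {
        "lower": sorted_totals[int(lower * (n - 1))],
        "median": statistics.median(sorted_totals),
        "upper": sorted_totals[int(upper * (n - 1))],
    }


# Hypothetical inputs for one destination and one scenario.
baselines = {"Source A": 50_000, "Source B": 20_000}
growth_params = {"Source A": (0.03, 0.05), "Source B": (0.06, 0.08)}
totals = simulate_destination(baselines, growth_params, years=5)
summary = summarise(totals)
print(summary)
```

Running the same function once per scenario, with scenario-specific growth parameters, would give one distribution per scenario that can then be compared.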
Validation
Why Validation Matters
Validation is one of the most important stages in the Global Student Flows model. Many stakeholders rely on these numbers to inform decisions, publish insights, and guide strategy. Because of this, trust in the data is critical. Validation ensures that the figures produced by the model are accurate, consistent, and dependable.
Starting with Strong Inputs
The foundation of any reliable model is high-quality input data. Validation begins at the input stage, where historical data is collected, gaps are identified, and estimates are made when necessary. These figures are then carefully reviewed to ensure they make sense within the broader education landscape of each country or destination.
A key focus at this stage is comparability. Data must meet consistent standards across countries and align with the defined scope of what counts as a “student flow.” Ensuring that all inputs are accurate and aligned with these definitions is essential, as the input data forms the backbone of the entire model.
Rigorous Forecasting Checks
After validating historical inputs, the process moves into forecasting. Forecasting follows a rigorous approach that combines both quantitative and qualitative checks.
Quantitative checks assess whether the model’s projections are statistically sound. At the same time, qualitative checks consider real-world factors that may influence future student mobility—such as policy changes, economic conditions, or shifts in student demand. Different scenarios are also explored to test how sensitive forecasts are to changing conditions.
The goal is to ensure that forecasts represent the best possible estimate based on both data and contextual insight.
Output-Level Quality Assurance
Validation does not stop once the model produces results. The output itself undergoes additional quality assurance checks.
One key step is verifying that major inputs and forecast assumptions have not changed unexpectedly during the modelling process. If the overall “story” told by the data shifts dramatically after running the model, it signals the need for further investigation.
The outputs are also reviewed to ensure internal consistency and logical coherence. This helps confirm that the final numbers are aligned with both the validated inputs and the broader trends in global education.
Engaging Internal and External Experts
Validation is strengthened through expert review. Both internal specialists and external consultants contribute at various stages of the process.
External experts who work directly in student recruitment provide on-the-ground insights into student demand, market conditions, and perceptions. These qualitative insights help test whether the numbers reflect real-world dynamics.
Internally, subject-matter experts challenge assumptions, test forecasting logic, and ensure the analysis stands up to scrutiny. Their engagement helps refine the model and ensures the outputs answer the kinds of questions stakeholders are likely to ask.
The Global Student Flows Model
The Global Student Flows Model is a computational framework designed to estimate and forecast international student mobility from source city to destination country level, with projections to 2030. The model integrates statistical inference, machine learning, and expert-informed constraints to generate a consistent, globally reconciled view of student flows over time.
The model uses the standardized data and projections produced by our analysts as inputs and paints a full picture of the global student landscape spanning billions of data points. It forecasts global student flows, fills critical information gaps, and reconciles all estimates to ensure total consistency in the final output.
Model Architecture
The model operates in three primary stages: historical reconstruction, forecasting and rebalancing. The model reconstructs historical student flows by filling data gaps through statistical inference, establishing a reconciled baseline for its 2030 projections. These forecasts utilize a hybrid methodology—blending expert-informed growth rates (CAGR) with historical time-series modelling—to ensure trends are grounded in both empirical data and real-world policy contexts. A final rebalancing phase aligns bottom-up bilateral flows with top-down country constraints, guaranteeing that all outputs remain internally consistent and structurally sound.
Historical Reconstruction (2014–Present)
This stage constructs a complete, reconciled dataset of historical student flows across source countries and destination countries.
Key steps include compiling inputs and filling data gaps, with statistical inference techniques applied where official data is incomplete or unavailable. A full grid of flows is then constructed to ensure continuity across years.
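To make the idea of a gap-free grid concrete, the sketch below fills missing years for a single source-to-destination corridor. Linear interpolation with flat extension at the edges is used here purely as a simple stand-in; the statistical inference techniques used in the model are not specified in this document, and the function name and figures are hypothetical.

```python
def fill_gaps(series, years):
    """Fill missing years for one corridor.

    series: {year: observed value} for the years that have data
    years: every year the grid must cover
    Gaps between observations are linearly interpolated; years before the
    first or after the last observation take the nearest observed value.
    """
    known = sorted(y for y in years if series.get(y) is not None)
    if not known:
        raise ValueError("need at least one observed year")
    filled = {}
    for year in years:
        if series.get(year) is not None:
            filled[year] = float(series[year])
        elif year < known[0]:
            filled[year] = float(series[known[0]])
        elif year > known[-1]:
            filled[year] = float(series[known[-1]])
        else:
            prev_year = max(y for y in known if y < year)
            next_year = min(y for y in known if y > year)
            share = (year - prev_year) / (next_year - prev_year)
            filled[year] = series[prev_year] + share * (series[next_year] - series[prev_year])
    return filled


# Hypothetical corridor with observations in only three of seven years.
years = range(2014, 2021)
observed = {2015: 1000, 2016: 1100, 2019: 1400}
grid = fill_gaps(observed, years)
print(grid)
```

Applying the same function to every source and destination pair would yield a complete grid with one value per corridor per year.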
Forecasting to 2030
Once the historical dataset is reconciled, the model generates forward projections using a hybrid forecasting approach.
The model combines Compound Annual Growth Rates (CAGR) and time-series modelling. Compound Annual Growth Rates are expert-informed growth assumptions developed through structured research. These reflect country-level outlooks, demographic trends, and policy trajectories. The time-series modelling framework identifies patterns in historical data with past values and trends informing projected values.
Using both approaches ensures that forecasts are grounded in observed statistical patterns and anchored in real-world expertise and policy context.
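One way to picture the hybrid approach is as a blend of two projected paths: one compounding an expert growth rate from the last observed value, and one extrapolating the historical trend. In the sketch below, the least-squares linear trend stands in for the time-series model, and the `expert_weight` blending parameter is a hypothetical device; neither is confirmed as the model's actual method.

```python
def cagr_path(last_value, cagr, horizon):
    """Project forward by compounding a fixed annual growth rate."""
    return [last_value * (1 + cagr) ** h for h in range(1, horizon + 1)]


def linear_trend_path(history, horizon):
    """Fit an ordinary least-squares line to the history and extrapolate it."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two historical points")
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n - 1 + h) for h in range(1, horizon + 1)]


def hybrid_forecast(history, cagr, horizon, expert_weight=0.5):
    """Blend the expert CAGR path with the statistical trend path."""
    expert = cagr_path(history[-1], cagr, horizon)
    trend = linear_trend_path(history, horizon)
    return [
        expert_weight * e + (1 - expert_weight) * t
        for e, t in zip(expert, trend)
    ]


# Hypothetical corridor history and expert growth assumption.
history = [100.0, 110.0, 120.0, 130.0]
forecast = hybrid_forecast(history, cagr=0.10, horizon=2)
print(forecast)
```

Setting `expert_weight` closer to 1 leans on the expert outlook, which may be preferable where a known policy change makes the historical trend a poor guide.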
Rebalancing and Reconciliation
Because the model applies both bottom-up (bilateral flows) and top-down (country totals) inputs, reconciliation is required.
A rebalancing stage ensures that:
- Total outbound flows from each source country match specified totals.
- Total inbound flows to each destination country align with constraints.
- Bilateral flows remain internally consistent within the system.
Rebalancing is a core statistical process within the engine. It allows the model to adjust bilateral flows proportionally while preserving structural relationships and constraints.
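Proportional adjustment against row and column totals is commonly done with iterative proportional fitting, and the sketch below uses that technique as an assumed example; the document does not state which algorithm the engine applies. The function name and the figures are hypothetical.

```python
def rebalance(flows, outbound_totals, inbound_totals, max_iter=1000, tol=1e-9):
    """Iterative proportional fitting on a source-by-destination flow matrix.

    flows[i][j]: initial estimate of students from source i to destination j
    outbound_totals[i]: required total leaving source i (row target)
    inbound_totals[j]: required total arriving at destination j (column target)
    """
    grand_total = sum(outbound_totals)
    if abs(grand_total - sum(inbound_totals)) > 1e-6 * max(1.0, grand_total):
        raise ValueError("outbound and inbound totals must sum to the same figure")
    matrix = [list(map(float, row)) for row in flows]
    for _ in range(max_iter):
        # Scale each row so it matches its outbound target.
        for i, target in enumerate(outbound_totals):
            row_sum = sum(matrix[i])
            if row_sum > 0:
                factor = target / row_sum
                matrix[i] = [value * factor for value in matrix[i]]
        # Scale each column so it matches its inbound target.
        for j, target in enumerate(inbound_totals):
            col_sum = sum(row[j] for row in matrix)
            if col_sum > 0:
                factor = target / col_sum
                for row in matrix:
                    row[j] *= factor
        # Columns now match exactly; stop once rows match as well.
        gap = max(abs(sum(matrix[i]) - outbound_totals[i]) for i in range(len(matrix)))
        if gap < tol:
            break
    return matrix


# Hypothetical two-source, two-destination system.
flows = [[40.0, 60.0], [30.0, 70.0]]
outbound_totals = [120.0, 80.0]
inbound_totals = [90.0, 110.0]
balanced = rebalance(flows, outbound_totals, inbound_totals)
print(balanced)
```

Because every step is a multiplication, corridors that start at zero stay at zero, which is one way the initial structure of the flow matrix is preserved in this sketch.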
Why Cities Matter
Cities are the most detailed geographic level in the Global Student Flows (GSF) model. While country-level data is widely available, it often lacks the precision that universities, policymakers, and education providers need. Institutions don’t just want to know how many students are coming from a country—they want to know which cities those students are coming from.
City-level insights allow stakeholders to better understand local demand, target recruitment strategies more effectively, and identify emerging urban education hubs. However, detailed data at this level is rarely available, making city-level modelling both valuable and complex.
Building a Global Cities List
To create meaningful city-level analysis, the model includes approximately 2,700 cities across 200 countries. The goal is to achieve global coverage while keeping the dataset manageable and relevant.
From Country Flows to City-Level Insights
Mapping student flows from countries to thousands of cities requires a structured methodology. Because direct city-level outbound mobility data is rarely available, the model estimates each city’s share of national outbound students using several core indicators:
- City Median Wage
- City Population
- English proficiency levels and international schools
- Domestic migration within a country
- Likelihood or Ranking Factor (a measure of a city’s relative propensity to send students overseas)
By combining these factors, the model allocates national outbound student totals proportionally across cities, producing estimated city-level shares of international mobility.
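The proportional allocation described above can be sketched as follows. The weighted-sum scoring rule, the indicator weights, the normalised indicator values, and the city names are all hypothetical, chosen only to show the mechanics of turning indicators into city shares.

```python
def allocate_to_cities(national_total, city_indicators, weights):
    """Split a national outbound total across cities by composite score.

    city_indicators: {city: {indicator_name: value scaled to the 0..1 range}}
    weights: {indicator_name: relative weight of that indicator}
    """
    scores = {
        city: sum(weights[name] * values[name] for name in weights)
        for city, values in city_indicators.items()
    }
    total_score = sum(scores.values())
    if total_score <= 0:
        raise ValueError("composite scores must sum to a positive number")
    return {
        city: national_total * score / total_score
        for city, score in scores.items()
    }


# Hypothetical indicator weights and normalised city indicators.
weights = {
    "population": 0.3,
    "median_wage": 0.3,
    "english_and_intl_schools": 0.2,
    "domestic_migration": 0.1,
    "likelihood_factor": 0.1,
}
city_indicators = {
    "City A": {"population": 1.0, "median_wage": 0.9, "english_and_intl_schools": 0.8,
               "domestic_migration": 0.7, "likelihood_factor": 0.9},
    "City B": {"population": 0.5, "median_wage": 0.6, "english_and_intl_schools": 0.4,
               "domestic_migration": 0.5, "likelihood_factor": 0.5},
    "City C": {"population": 0.2, "median_wage": 0.3, "english_and_intl_schools": 0.2,
               "domestic_migration": 0.2, "likelihood_factor": 0.1},
}
allocation = allocate_to_cities(10_000, city_indicators, weights)
print({city: round(value) for city, value in allocation.items()})
```

By construction the city figures sum back to the national total, which is the property the consistency checks in the validation stage rely on.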
Validating the Numbers
City-level estimation is only useful if it is credible. Once the model generates outputs, extensive validation processes are applied.
Validation methods include:
- Ground-level research: Reviewing local reports, publications, and regional data sources where available.
- Consistency checks: Ensuring that city-level totals align with trusted country-level data.
- Data visualization: Using maps, charts, and geographic overlays to identify outliers or inconsistencies.
As many cities do not publish outbound student data, estimation remains challenging. The process relies on cross-checking multiple signals and continuously refining assumptions to improve reliability.
