Across China, an Unseen Rural Workforce Is Shaping the Future of AI
SHAANXI, Northwest China — Three years ago, He Jie was a stay-at-home mom in Qingjian, one of China’s poorest counties, running her home on the modest salary her husband earned as a migrant electrician.
Now, she’s an integral part of the booming AI industry: Every click and scroll she makes is a quiet yet crucial contribution to the vast and intricate process of machine learning.
On a Wednesday in October, 35-year-old He carefully examines route signs and guideposts on images of roads. An AI labeler, she identifies any discrepancies between the pictures of the same stretch of road and updates the navigation platform so users have access to accurate and current information.
AI labeling, also known as data annotation, is the process of tagging and categorizing vast amounts of data to train AI models, from identifying and categorizing images to annotating text for natural language processing. Carried out by human labelers, this often time-consuming and labor-intensive task is essential to developing AI applications ranging from self-driving cars to natural language processing tools.
He Jie’s foray into this industry began in 2020 when she joined Qingjian Aidou, an AI training company. At the time, she was battling postpartum depression and desperately sought a change of pace. The job gives He a sense of purpose and supplements the income of her husband, who works as an electrician in Yinchuan, in the neighboring Ningxia Hui Autonomous Region.
This pivot from her previous life is not just a personal success story — it reflects a larger trend in Qingjian and similar towns across the country. As technology firms extend their reach into rural China, such jobs offer financial stability and also allow women like He to balance work with their responsibilities as mothers. She told Sixth Tone: “I like the person I’ve become.”
Her company tackles a wide range of tasks. For example, its labelers might tag a picture of a cat with the word “cute” or categorize waste for recycling. Other jobs include organizing written text into distinct paragraphs.
While AI labeling may seem like a daunting task for unskilled workers at first, it’s relatively easy to learn and manage with practice. It’s why AI labeling has been a boon for unskilled workers, particularly women, people with disabilities, and those from marginalized groups.
Recognizing its growing importance, China named AI labeling as an emerging profession in 2020. The move helped legitimize the field and opened doors for support and growth, while offering existing professionals a competitive edge in the job market.
In the two years since, China’s data labeling market grew to 5.08 billion yuan ($713 million), a 17.3% increase from the previous year, according to data from research agency Insight and Info. And according to one expert, there are nearly 2 million AI labelers in the country.
Local governments also recognized its potential in aiding poverty alleviation efforts, particularly in remote and underdeveloped areas with scarce industrial resources, like Qingjian. It’s also why data labeling factories are increasingly being established in such regions, capitalizing on the lower costs of labor and office space.
For instance, a labeler may earn around 10,000 yuan per month in the eastern city of Hangzhou, while in Qingjian, the salary hovers between 4,000 and 5,000 yuan. Despite the lower wages compared to bigger cities, for He and many women in her community, data labeling is still one of the best-paying jobs available locally.
But in recent months, the AI industry’s longstanding reliance on human labelers for data accuracy has been challenged by the very technologies they helped advance. Sophisticated AI models like ChatGPT are surpassing human skills in data annotation, signaling a shift toward automation that is already reducing the demand for human involvement.
Moreover, the intense demand for precision has sparked fierce competition, not only pressuring smaller firms like Qingjian Aidou but also drawing attention from industry giants, who are leveraging their scale and advanced AI to redefine efficiency and accuracy in the field.
Beyond economics
Before starting at Qingjian Aidou, all employees are required to attend a training course in data labeling and curation. Mastering these new skills posed a significant challenge for He and her colleagues.
Trained as a nurse at a local vocational school, He is among the best educated at her company, where most employees have completed only middle or high school. “I’ve seen several colleagues resign due to the demanding nature of the training courses,” she says.
She also acknowledges the work’s demanding nature, particularly the need for near-perfect accuracy — the office standard for her current transport project is set at 99%. Then there’s the tedium of repetitive tasks. For example, an AI labeler might be asked to label thousands of images of cats and dogs, making sure to correctly identify each animal in every image. It can be mind-numbing, and easy to make mistakes.
However, despite the challenges, He finds the work rewarding in several ways: She enjoys the sense of accomplishment from completing a difficult task and she values the financial independence the job provides. But more importantly, the job is largely sedentary, making it particularly suited to her circumstances. A disability in her right leg, stemming from birth-related brain hypoxia, limits her ability to perform physically demanding tasks.
He Jie is compensated per image: The complexity of the image and the project’s specific requirements determine her pay rate, which can range from a few fen (one fen equals one-hundredth of a yuan) to a few yuan per image.
And with her experience, she reviews more than 20,000 images each day, often averaging just a few seconds on each one. This nets her between 4,000 and 5,000 yuan a month, a substantial sum in a town where government employees typically earn around 2,000 yuan.
As a stay-at-home mom, she often refrained from personal expenditures to stretch her husband’s salary further. Now, with her own income, she enjoys the financial freedom to indulge in nicer clothes and cosmetics for herself.
At Qingjian Aidou, her story is one of many. Colleague Li Fang, also a mother of two, has found economic independence and an empowered role in her family’s decisions. “My husband believes our daughter’s dance lessons are frivolous, but now I support her passion with the money I earn,” says Li.
According to Yang Jianghua, a sociology professor at Xi’an Jiaotong University, whose team has researched digital employment in 19 counties, including Qingjian, the data labeling industry’s social impact in less developed areas often outweighs its economic benefits.
For decades, he says, residents of rural China have migrated to cities for work, leaving families behind, and those who return tend to pursue civil service exams, seeking the job security they offer.
But in Qingjian now, about 70% of AI labeling company employees are mothers from the ’90s generation. Many grew up without parents since they were always away working, so such local digital jobs are crucial, says Yang.
“Many mothers who have once been the left-behind children themselves said they will do whatever is needed to stay together with their children,” says Yang. “Without the stabilization of this group, it is highly likely that the real estate market at the county level could collapse due to residents leaving for larger cities.”
Social impact
Nestled in China’s Loess Plateau and cradled by a valley along the Qingjian River — a tributary of the Yellow River — Qingjian is the birthplace of the renowned novelist Lu Yao. And it mirrors the northern Shaanxi countryside depicted in his best-selling novels.
Yet, its mountainous terrain and seclusion place it off the usual path for industrial development. The local economy is primarily agricultural, but climate change, manifesting in heavier autumn rains, has adversely affected the harvest of the jujube fruit, a regional specialty, prompting numerous farmers to leave their lands.
In 2020, Qingjian was among the last batch of counties to announce the eradication of absolute poverty. Now, despite numerous high-rise buildings signaling modernization, many residents, including He, still reside in traditional yaodong — earthen dwellings typical of the Loess Plateau. The county’s average annual disposable income stood at 20,620 yuan in 2022, still lagging behind the national average of 36,883 yuan.
For years, the local government’s poverty alleviation initiatives largely concentrated on agriculture-based projects, according to Yu Tao, the general manager of Qingjian Aidou. Before he joined the company, Yu served as the vice president of the Qingjian County Urban Investment Company, where he spearheaded local agriculture projects.
Like other struggling counties, Qingjian funneled money into projects like pig farming and the fruit industry before the 2020 poverty deadline, says Yu, adding, “But almost all of those initiatives ended in failure because we had neither the proper resources nor the talent to run them.”
But since the region already had high-speed internet in place, the county quickly pivoted to the AI labeling industry to fight poverty. And it didn’t require much investment to start either.
The Aidou project, started by the China Women’s Development Foundation, Ant Group, and Ant Charity Foundation, arrived in Qingjian in 2019 with the aim of creating digital economy jobs in rural areas.
But the journey hasn’t been easy, says Yu. “Hiring was difficult in the beginning because the job seemed alien to locals. Moreover, candidates were skeptical because most often still associated computer-related jobs with online addiction.”
Slowly, Qingjian Aidou’s influence grew, particularly during the pandemic, when the company, the only one offering data labeling jobs in the county, offered its employees remote work and above-average salaries.
Operating as a social enterprise with government backing, it prioritizes employee compensation and welfare. And mindful of its largely female workforce’s needs, it also supports flexible working hours and childcare, making it a valuable employer in the region.
This model has now been replicated in nearby counties, with multiple companies under the Aidou project providing jobs to over 900 locals and highlighting the potential of digital employment in rural areas.
The change dilemma
Currently, machine learning technologies depend greatly on the “human-in-the-loop” approach, where input from AI labelers is crucial for teaching machines.
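One common way to picture the “human-in-the-loop” approach is as a routing step: a model labels the items it is confident about, and everything else goes to a human labeler. The Python sketch below is only an illustration under stated assumptions. The function names, the toy model, and the 0.9 confidence threshold are all hypothetical and do not describe Qingjian Aidou’s or any client’s actual pipeline.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LabeledItem:
    item_id: str
    label: str
    source: str  # "model" or "human"


def route_items(
    item_ids: List[str],
    model_predict: Callable[[str], Tuple[str, float]],
    human_label: Callable[[str], str],
    threshold: float = 0.9,  # hypothetical cutoff, chosen for illustration
) -> List[LabeledItem]:
    """Keep the model's label when it is confident; otherwise ask a human."""
    results = []
    for item_id in item_ids:
        label, confidence = model_predict(item_id)
        if confidence >= threshold:
            results.append(LabeledItem(item_id, label, "model"))
        else:
            # Low-confidence items are the ones that still need a person.
            results.append(LabeledItem(item_id, human_label(item_id), "human"))
    return results


# Toy stand-ins for a real model and a real labeler (both hypothetical).
def toy_model(item_id: str) -> Tuple[str, float]:
    predictions = {"img-1": ("cat", 0.98), "img-2": ("dog", 0.55)}
    return predictions[item_id]


def toy_human(item_id: str) -> str:
    return "cat"


labeled = route_items(["img-1", "img-2"], toy_model, toy_human)
for item in labeled:
    print(item.item_id, item.label, item.source)
```

In this sketch, the human’s answers on the hard cases are what would later be fed back to improve the model, which is why the quality of that human input matters so much.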
According to research firm Cognilytica, data-related tasks take up over 80% of AI project time, with a quarter of that devoted to data labeling. Simply put, the quality of labeled data goes a long way toward determining how well an AI system performs.
And with such high demand for precision, the industry has grown fiercely competitive: Smaller firms face pressure both from larger companies and from advances in AI technology such as ChatGPT, which is beginning to outperform human labelers in several tasks.
Long-term challenges loom as well. Recent research from the University of Zurich indicates that ChatGPT can surpass crowdworkers in labeling data for relevance, stance, and topic identification. It means tech giants are increasingly turning to AI automation over human labor, prioritizing the gains in accuracy and efficiency it offers.
Though Qingjian Aidou still receives a steady stream of orders, several employees say there has been a noticeable slowdown and a dip in pay, even as tasks have grown more complex.
“Since the start of the year, many of the simpler tasks once handled by human hands have shifted to automated AI processes,” says He.
In the early days, work was plentiful, allowing He to earn up to 8,500 yuan monthly with extra effort and overtime. This is now less common. Her team is increasingly focused on more intricate work, such as route planning for navigation tools.
“We often discover that the difficulty has increased after one or two weeks,” says Yu, adding that the new tasks now often involve specialized fields requiring knowledge in finance and medicine, which has forced the company to raise its educational requirements for employees.
Initially, the work involved simpler tasks, like classifying objects in an image or comparing AI-generated text to human writing. But currently, tasks demand an understanding of highly specific terms and concepts, such as “acceptance of bills” or “operational risks” in finance and banking.
To address this, the company has begun conducting training sessions to upskill its workforce. “Some of our middle school and junior high graduates have left because the tasks have become more challenging for them,” says Yu.
Recruiting suitable candidates has been a persistent challenge for the company. Despite Qingjian’s population exceeding 200,000, the pool of potential employees is relatively small. “Around 200 people is the ceiling of what a small town’s demographic can offer when taking suitable age and education background into consideration,” says Professor Yang from Xi’an Jiaotong University.
Cao Yali returned to her hometown in Qingjian from Xi’an to care for her family. In AI labeling, she found a job that pays as well as, if not better than, her previous job, with the added benefit of a lower cost of living. She says she not only supports her parents but also enjoys a better quality of life, recently buying a car and planning camping trips with coworkers.
Asked if she was worried about losing her job to AI, she said her current job was quite stable, and that she was constantly learning in order to improve herself. “Despite technological advancements, I believe there will always be a place for human workers.”
He Wei, CEO of Macro Management Consulting, underscores that the demand for AI labeling in China’s vast market will remain significant for the foreseeable future.
“AI’s expansion into various sectors isn’t slowing down, even if it might in areas where it’s already well-established,” she told Sixth Tone. To remain relevant, she says, labelers and companies must enhance their skills and explore new sectors beyond the established ones of transportation and content moderation.
She also highlights a new role on the horizon: the AI prompt engineer, a position that focuses on crafting queries for AI tools to ensure high-quality output.
She says: “This could open doors for those with less formal education, as fundamental tasks in AI prompting — like creating texts, videos, or images through dialogue with AI — do not demand an extensive cultural knowledge.”
Editor: Apurva.
(Header image: The Qingjian Aidou office, Shaanxi province, October 2023. Li Xin/Sixth Tone)