During the last fifteen years, several luminaries have raised the specter of massive job losses due to the advent of contemporary AI systems. Probably the most influential article was published in 2013 by Frey and Osborne, which stated that “47% of total US employment is in the high-risk category, meaning that associated occupations are potentially automatable over some unspecified number of years, perhaps in a decade or two” [1]. Since the United States will have a working population of around 180 million by 2033, Frey and Osborne’s article implies that around 85 million jobs will be lost within the next eight years. Similarly, about three weeks ago, in an interview with Axios, Dario Amodei, CEO of Anthropic, ominously mentioned that “AI could wipe out half of all entry-level white-collar jobs – and spike unemployment to 10-20% in the next one to five years” [2]. Despite Frey and Osborne’s prognostications, almost no jobs have been lost to AI so far. Nevertheless, such statements have sent ripples through the boardrooms of countless organizations as well as through government bodies. Of course, the jobs mentioned by Frey and Osborne as well as those mentioned by Amodei may be lost in due course, but will they be lost during the next five to eight years, or will it take much longer?
This article provides evidence that such a massive job loss is unlikely to occur for at least the next decade. It also argues that despite the hype regarding current AI systems, they are still quite rudimentary, their total cost of ownership (TCO) is enormous, and their accuracy is saturating at around 92% (+/- 2%). Indeed, the main thesis of this article is that these characteristics will impede massive job losses during the next ten years in the United States or elsewhere.
This article contains eight sections. Section 1 discusses why organizations are extremely slow in adopting any new technology, thereby restricting the speed of job losses. Section 2 discusses the debilitating effects of the apparent “saturation limit of 92% (+/-2%) accuracy” that is being exhibited by even the most sophisticated, contemporary AI systems. This level of accuracy is one of the main reasons why obtaining a decent return on investment (RoI) remains elusive. Sections 3, 4, and 5 discuss three vast areas where the 92% level of accuracy and the huge total cost of ownership (TCO) will continue to be the main obstacles in replacing human jobs. In contrast, Sections 6 and 7 discuss two major areas where this accuracy is already sufficient for substantial job reduction but won’t lead to massive job losses for at least a decade. Finally, Section 8 concludes with a discussion regarding job losses during the next twenty-five years and why such losses are likely to be humungous but perhaps even beneficial for human society.
In almost all situations, organizations change their modus operandi only if there is a specific solution (e.g., technology or process) that improves one of the following key performance indicators (KPIs): (a) revenue, (b) cost, (c) quality, (d) timeliness, and (e) customer satisfaction.
For example, during the last 40 years, outsourcing of manufacturing and services jobs to China and India occurred because organizations in wealthy countries realized that they could save 35% to 40% in costs and improve timeliness (by having one set of people working in Asian time zones and another in European or American time zones).
Notably, outsourcing of manufacturing jobs from the United States to lower-wage countries started around 1979, and our analysis shows that by 2016, the U.S. had cumulatively lost around 18 million manufacturing jobs due to outsourcing. Similarly, outsourcing of service jobs from the U.S. began around 1995, and our analysis shows that by 2016, the U.S. had lost only around 12 million such jobs. Hence, job losses due to outsourcing totaled around 30 million, or around 20% of the working population in the U.S. in 2016, and this occurred over a long span of 21 to 37 years. Undoubtedly, if the global economy were frictionless (i.e., if all restraints and regulations associated with the economy were non-existent), almost all the 47% of jobs identified by Frey and Osborne would already have been lost to lower-wage countries via outsourcing within two decades.
As discussed in the first chapter of [3], given below are a few reasons why organizations often take substantial time to fully integrate even the most vital inventions (e.g., those related to AI):
In addition, large organizations face massive challenges with respect to their internal data, which is often in silos (and hence very hard to access and not harmonized), extremely “dirty” with missing or wrong elements, and inconsistent, with the same terms having different meanings in different departments. For example, for the finance department, the number of units sold may mean the total number of non-defective units produced, whereas for the sales department, the number of units sold may mean the actual number of units that produced revenue but not the ones given for “free” in various marketing campaigns.
As mentioned in Section 1, organizations change their modus operandi only if there is a specific solution that improves one of the following key performance indicators (KPIs) – (a) revenue, (b) cost, (c) quality, (d) timeliness, and (e) customer satisfaction.
If an organization is currently solving a problem in a specific manner and if a new technique or solution – e.g., an AI system – can improve one or more of these KPIs, then this problem becomes a “use case” for this new technique. Since AI systems run on computational, storage, and networking infrastructure that uses semiconductors, they effectively communicate via electrons. In contrast, human neurons use ions for communication and hence are approximately ten thousand times slower. Hence, AI systems provide better timeliness than humans and recognize patterns much faster.
However, with respect to the other KPIs, AI systems fall short because (a) their total cost of ownership (TCO) is currently very high and (b) their accuracy level has saturated around 92% (+/-2%). These characteristics are further discussed below:
Traditional software requires regular maintenance, which is often called Software DevOps. This includes performing the following tasks on a regular basis – refactoring code and deleting portions that are no longer required, streamlining code to make it modular (so that it can be easily understood by other developers), eliminating bugs, and improving testing, user interfaces, and Application Programming Interfaces or APIs (for input and output). On average, the annual Software DevOps effort ranges between 10% and 20% of the total effort required to build the traditional software initially. Since AI systems essentially consist of software, they also require traditional Software DevOps for regular maintenance.
However, to incorporate changes (because of incoming data or other reasons), many AI systems also need repeated re-doing of the pipeline partially or entirely. This includes data gathering, data labelling, AI model training, and repackaging. For example, many Deep Learning Networks (DLNs) and Large Language Models (LLMs) are usually updated via the following techniques, which are all quite expensive:
Our analysis shows that maintaining AI systems in this way costs three to four times as much as maintaining traditional software, which implies that the annual maintenance cost of such systems is likely to range between 30% and 60% of the cost of building the first operational (or “in production”) version. Since the maintenance cost of contemporary AI systems is often exorbitant, it hinders a decent return on investment in the long run, thereby often making it unviable to operationalize a proof of concept.
Unfortunately, because this enormous cost of maintaining AI systems is an uncomfortable issue, it is being largely avoided, thereby becoming an “elephant in the room”. This is mainly because many AI companies are currently flooded with investor cash, and to gain more market share, they are providing AI systems at very cheap prices. In fact, more than 95% of such companies have been loss-making, and those that are profitable (like Palantir) charge exorbitant amounts. However, since the TCO of many such AI systems is enormous, eventually either the AI hype will go bust or these unprofitable companies will be forced to raise their current prices by a factor of ten, which will make their AI systems totally uncompetitive.
Undoubtedly, the more accurate an AI system becomes, the fewer humans are required to work on that task. For example, if the job of a data entry person is to transcribe data from a scanned document into a spreadsheet and if an AI system can achieve this goal with a human level of accuracy, then this data entry person is no longer required for manual entry of data. Hence, the eventual aim of AI systems continues to be to rival or beat humans with respect to accuracy for a given task, thereby making humans more productive and useful elsewhere.
Many creators of AI products currently make laudatory claims regarding their systems being 99%+ accurate. This may be true for the limited data sets on which they have tested their systems, but experiments show that when tested on a reasonably large dataset, pretty much all trained DLNs and contemporary LLMs seem to “saturate at 92% (+/-2%) accuracy”. This is primarily because these large sets usually have “data with boundary cases” (which is often also called “corner case or edge case data”). In fact, improving this saturation limit of DLNs and LLMs is an active research area, and further information in this regard can be found in [4, 5, 6]. Unfortunately, even if the most accurate LLMs are combined with other AI systems, the overall accuracy of the composite system seems to be only around 92% (+/-2%), thereby providing almost no advantage (on average).
Not only do these systems provide only around 92% accuracy, they also do not provide any indication as to where they are wrong. This forces the user to check all the output coming from these systems. For example, if a Visual LLM (VLLM) converts a scanned table with 100 cells into an electronic format, then either the user accepts the eight incorrect cells scattered in the output table or the user must check all 100 cells, determine the eight cells where this VLLM gave an incorrect answer, and then fix these eight cells. Such a “human in the loop” adds to the total cost of ownership of AI systems. This is starkly different from conventional software, which is rules-based and therefore 100% correct (except when it occasionally has bugs).
Finally, using the confidence levels of DLNs to predict inaccuracies does not work either. This is because DLNs often give wrong answers with 99% confidence. For example, in 2015, researchers at Carnegie Mellon University generated random images by perturbing patterns, and they showed both the original patterns and their mutated copies to previously trained DLNs [7]. Although the perturbed patterns were essentially meaningless, the DLNs incorrectly recognized these with over 99% confidence as a king penguin, starfish, baseball, etc.
This section provides two examples where the output accuracy needs to be 99% and the 92%-accuracy saturation limit makes contemporary AI systems extremely expensive and unaffordable. One such area is Intelligent Document Processing (IDP) and the other is “Knowledge Management” (where precise information is required).
Current IDP systems use AI to perform various operations, including extracting, capturing, processing, categorizing, and reconciling data from various document formats. Such formats include structured, semi-structured (e.g., forms), as well as unstructured formats (which may be in PDF or scanned files).
Since most organizations around the world have lots of unstructured data, converting it into structured electronic data is critical. According to Grand View Research, IDP’s total global market size was estimated at 2.3 billion dollars in 2024 and is expected to grow to 12.4 billion dollars by 2030 [8]. In fact, the area of IDP is so large that the book “Intelligent Automation” and the website “www.scryai.com/use-cases” provide several hundred potential use cases in this area [9, 10]. To illustrate the TCO challenge of using AI solutions for numerous use cases related to IDP, consider the following example of converting invoices (from scanned or PDF formats to a structured format):
Indian companies typically charge 24,000 dollars for converting 100,000 invoices, which come in scanned or PDF formats, into a structured format. Typically, each of these invoices has an average of 18 key-value pairs (e.g., name of the vendor, sub-total amount, taxes levied, and total amount) that need to be extracted and reconciled. Before the advent of modern AI systems, these companies used inexpensive Optical Character Recognition (OCR) systems to convert paper-based invoices into an electronic format. Such OCR systems used to be around 80% accurate but did not provide any indication as to where they were wrong. Hence, after the OCR conversion, these companies used data analysts to determine which 80% of the 1.8 million key-value pairs were correct and then fix the 20% that were incorrect. Since checking whether a key-value pair is correct requires around 2 seconds, checking the correctness of all key-value pairs would take 3.6 million seconds, i.e., 1,000 manual hours. Similarly, since correcting a wrong key-value pair requires five seconds, this task would take 5 * (20% of 1,800,000) = 1.8 million seconds, i.e., 500 hours. Hence, the analysts spent a total of 1,500 hours, for which the company charged $24,000 (i.e., $16 per hour).
With the advent of improved AI systems, the accuracy of conversion has gone up to 92%. However, since these AI systems still do not provide any indication of where the wrong key-value pairs are, the data analyst would still need to spend 1,000 hours finding the location of errors. This time around, however, the data analyst needs to fix only the 8% of the 1.8 million key-value pairs that are incorrect (i.e., 144,000 pairs) by spending five seconds on each, thereby spending 720,000 seconds or 200 hours. Hence, with the modern AI system, the analyst would spend 1,200 hours and save 300 hours or $4,800 in total (assuming $16 per hour).
Now, the typical cost of using the modern AI system for the above task is around 10 cents per invoice [9], which means a cost of $10,000 for 100,000 invoices and a net loss of $10,000 – $4,800 = $5,200! This implies that rather than saving any money for the company or its client, the total cost of ownership for this AI system (including the cost of data analysts) is substantially more than that in the past.
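The arithmetic of this invoice example can be reproduced with a short Python sketch. All figures come from the example above (18 key-value pairs per invoice, 2 seconds to check a pair, 5 seconds to fix one, $16 per hour, 10 cents per invoice); the function and variable names are illustrative only, and the model assumes, as the example does, that every pair must be checked because the system gives no hint of where its errors are.

```python
def review_hours(pairs, accuracy, check_s=2.0, fix_s=5.0):
    """Analyst hours needed to check every pair and fix the wrong ones."""
    check_hours = pairs * check_s / 3600                 # every pair is checked
    fix_hours = pairs * (1 - accuracy) * fix_s / 3600    # only wrong pairs are fixed
    return check_hours + fix_hours

INVOICES = 100_000
PAIRS = INVOICES * 18           # 1.8 million key-value pairs
RATE = 16.0                     # dollars per analyst hour
AI_FEE_PER_INVOICE = 0.10       # dollars, as quoted in the example

ocr_hours = review_hours(PAIRS, 0.80)            # about 1,500 hours
ai_hours = review_hours(PAIRS, 0.92)             # about 1,200 hours
labor_saving = (ocr_hours - ai_hours) * RATE     # about $4,800
ai_fee = INVOICES * AI_FEE_PER_INVOICE           # about $10,000
net = labor_saving - ai_fee                      # about -$5,200 (a net loss)

print(f"OCR workflow: {ocr_hours:,.0f} h, AI workflow: {ai_hours:,.0f} h")
print(f"Labor saving: ${labor_saving:,.0f}, AI fee: ${ai_fee:,.0f}, net: ${net:,.0f}")
```

Under these assumptions, the checking time (1,000 hours) dominates in both workflows, which is why raising accuracy from 80% to 92% saves only 300 hours.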
With respect to saving costs or providing a positive return on investment, the above example regarding invoices can be extended to more than a hundred use cases given in [10, 11], thereby making current AI systems for IDP essentially useless. Of course, the TCO for AI systems would be substantially different if they also indicated the locations where they believed they were correct (and were indeed correct). However, as mentioned in Section 2, this does not seem to be possible with the current AI systems.
Internal data of most organizations is stored in unstructured documents that usually contain the following objects:
Even though LLMs and GPTs are trained on half a billion to a billion pages or more, at most 5% of this training data is related to objects other than text. Hence, these LLMs are unable to provide precise answers when they are asked questions regarding documents containing such objects (e.g., tables). For example, if a question requires a precise answer (e.g., how many employees were on maternity leave last month), then since the LLM is correct only around 92% of the time, the user is forced to check all the underlying data, thereby increasing manual labor markedly.
Finally, the above-mentioned use cases require the output to be 99%+ correct. However, Sections 6 and 7 provide a plethora of use cases where even 85% accuracy of AI systems improves the KPIs mentioned in Section 2, thereby improving the return on investment for the end client.
AI Agents are AI systems that interact with the world to perform specific tasks independently or semi-independently. Many organizations are trying to deploy these agents to execute complex tasks. But because of the apparent 92%-accuracy (+/-2%) saturation limit of the underlying AI systems, if a complex task is split across several AI agents, this may lead to very poor accuracy of the entire system, and in some cases, the system may not even be able to finish the task.
For example, consider a complex task that comprises ten jobs, which need to be executed by ten AI Agents that are configured in a single pipeline. Since each agent has 92% accuracy and since they are in a pipeline (and assuming their errors are independent), the total accuracy of the system will be $(0.92)^{10} \approx 0.43$, which implies that the entire solution will only have an accuracy of around 43%. This in turn implies that the entire AI-based solution may not be able to finish most of the jobs, thereby requiring the data analyst (or “human in the loop”) to do most of the work. Hence, the TCO of the entire solution will become unaffordable.
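The compounding effect in this example can be sketched in a few lines of Python. The sketch assumes that each agent succeeds independently with the same probability and that the pipeline succeeds only if every agent does; the function name `pipeline_accuracy` is illustrative and not taken from any library.

```python
def pipeline_accuracy(per_agent_accuracy, agents):
    """End-to-end accuracy of a serial pipeline with independent agents."""
    return per_agent_accuracy ** agents

# How end-to-end accuracy decays as more 92%-accurate agents are chained.
for n in (1, 3, 5, 10):
    print(f"{n:2d} agents: {pipeline_accuracy(0.92, n):.3f}")
```

For ten agents this evaluates to roughly 0.434, so fewer than half of the runs would be expected to complete correctly under these assumptions.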
In fact, such a hypothetical scenario was recently confirmed by researchers at Carnegie Mellon University, who staffed a fake software company with AI Agents [12]. This simulation, called The Agent Company, was staffed only with artificial workers from Google AI, Amazon AI, Anthropic and Meta AI. To see how these agents coped in a real-world environment, the researchers assigned tasks that were similar to the daily tasks in a real software company. For example, the agents navigated through file directories, virtually toured new office spaces, and wrote performance reviews.
The best-performing model was Anthropic’s Claude 3.5 Sonnet, which finished only 24% of the jobs assigned to it. And even this turned out to be astronomically expensive, averaging nearly 30 steps at a cost of six dollars per task. Google’s Gemini 2.0 Flash ranked second. It took an average of 40 steps per finished task and had a success rate of 11.4%. And the worst AI agent was Amazon’s Nova Pro v1, which took an average of 20 steps and finished only 1.7% of its assignments.
Interestingly, hallucinations and self-deception also occurred in The Agent Company. For example, according to [12], “during the execution of one task, the agent cannot find the right person to ask questions on [company chat]. As a result, it then decides to create a shortcut solution by renaming another user to the name of the intended user.”
In summary, because of the 92%-accuracy limit, AI agents can do small and restricted tasks reasonably well, but achieving accuracy for complex tasks remains elusive.
The 92% accuracy of AI systems is also a limiting factor when humans are directly involved, and hence, such systems are usually unable to improve the KPIs discussed earlier. Given below are a few examples; other examples can be found in [3] and [13]:
There are at least two hundred use cases where 92% (or even less) accuracy of AI systems is sufficient to improve the KPIs given earlier. Since most of these use cases are related to individuals trying to improve their productivity and cost, they are not discussed in this article, whereas those related to organizations are discussed below.
Employees in organizations are using AI and Generative AI (including LLMs) in the following broad categories:
According to the U.S. Bureau of Labor Statistics, around 161.4 million people were working in 2024 in approximately 600 different occupations [15]. Our analysis shows that among these 600 occupations, around 7.6 million people, i.e., around 4.7% of all working people in the U.S., were in jobs that fall within the categories above. If contemporary AI systems are used, then instead of humans doing all the work, only 40% to 50% of them may be required for these 7.6 million jobs. Hence, if 55% of these jobs are lost during the next five years, the total number of jobs lost will be around 4.2 million (i.e., around 2.6% of the entire workforce in the U.S. in 2024). Of course, since the working population in the U.S. will be higher in 2030, this implies that on average, around 0.5% of the entire workforce will lose jobs every year. Hence, this analysis clearly contradicts Amodei’s claim that “AI could wipe out half of all entry-level white-collar jobs – and spike unemployment to 10-20% in the next one to five years” [2]. Indeed, not only will the job loss be limited, but most such people are also likely to end up doing higher-end or different kinds of work in the same or similar organizations. For example, entry-level programmers may learn quickly (using AI systems) and end up gathering requirements or creating preliminary information technology architecture for their projects. Finally, a more extensive discussion regarding the effects of LLMs and Generative AI in displacing the kinds of jobs mentioned above will be provided in a subsequent article [16].
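The job-loss estimate above follows from a few multiplications, reproduced here as a small Python sketch. The inputs are the figures quoted in the paragraph (161.4 million workers, 7.6 million exposed jobs, a 55% loss over five years); the variable names are illustrative, and the sketch uses the 2024 workforce as the denominator throughout, which slightly overstates the annual share if the workforce grows.

```python
WORKFORCE_2024 = 161.4e6   # people working in the U.S. in 2024
EXPOSED_JOBS = 7.6e6       # jobs in the occupations considered above
LOSS_SHARE = 0.55          # assumed share of those jobs lost
YEARS = 5                  # assumed time horizon

exposed_share = EXPOSED_JOBS / WORKFORCE_2024      # roughly 4.7%
jobs_lost = EXPOSED_JOBS * LOSS_SHARE              # roughly 4.2 million
share_of_workforce = jobs_lost / WORKFORCE_2024    # roughly 2.6%
annual_share = share_of_workforce / YEARS          # roughly 0.5% per year

print(f"Exposed share of workforce: {exposed_share:.1%}")
print(f"Jobs lost: {jobs_lost / 1e6:.1f} million ({share_of_workforce:.1%})")
print(f"Average annual loss: {annual_share:.2%} of the workforce")
```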
Decision Support Systems are computer programs that help organizations and individuals in making better decisions by analyzing large amounts of data and incorporating numerous variables. In fact, the amount of data is often so large and the number of variables so huge that without such systems, most organizations either do not attempt to solve the corresponding problem or do so by taking a random sample of data (thereby missing many key insights). Since AI systems usually provide pattern recognition much faster, even with 92% accuracy, these systems often help organizations by analyzing all the data (instead of only a random subset), thereby providing a higher return on investment. Given below is a use case that illustrates this point further.
If employees travel for work-related purposes, most organizations reimburse them for the cost of their airfare, taxi fare, food, hotel, and other travel related expenses. However, because of governmental restrictions and those imposed by these organizations, some items are non-reimbursable.
To get their reimbursements, employees fill out reimbursement forms (detailing their expenses) and provide receipts, invoices, and bills to justify them. In many organizations (e.g., consulting companies), since travel is an integral part of employees’ work, the number of reimbursement forms and receipts is usually so large that the manual cost of verifying each form and each receipt is prohibitive. Hence, compliance departments usually choose some forms randomly and determine whether all restrictions are being met. Unfortunately, in doing so, they end up missing many forms that are non-compliant, thereby causing the organization to lose money and potentially be non-compliant (with respect to government regulations). And the consequences of non-compliance can range from potential penalties from the government (especially during audits) to reputational damage and financial impairment.
Although there is no specific data regarding the percentage of non-compliant forms, a common belief within organizations is that around 2% are non-compliant. Given this backdrop, if all the travel reimbursement forms and the corresponding receipts are processed by an AI-based DSS, then this system is likely to capture 92% of all non-compliant forms. Since the forms flagged by the AI-based DSS are around 92% * 2% (i.e., 1.84%) of all submitted forms, the compliance department needs to check only these forms and ensure that they are indeed non-compliant. Of course, the AI-based system may have missed around 8% * 2% (i.e., around 0.16%) of all forms that are also non-compliant, but because it has captured most of them, it has improved compliance by more than 90%. Also, it has saved money for the organization and avoided potential penalties (from the government).
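The triage arithmetic in this use case can be written out as a short Python sketch. It uses the 2% non-compliance rate and 92% capture rate quoted above; the volume of 100,000 forms and the function name `dss_triage` are hypothetical, and the sketch follows the paragraph in ignoring false positives (compliant forms that the system flags by mistake).

```python
def dss_triage(forms, noncompliant_rate=0.02, capture_rate=0.92):
    """Return (flagged, missed) counts of non-compliant forms."""
    noncompliant = forms * noncompliant_rate
    flagged = noncompliant * capture_rate    # forms sent to the compliance team
    missed = noncompliant - flagged          # non-compliant forms that slip through
    return flagged, missed

FORMS = 100_000  # hypothetical annual volume, for illustration only
flagged, missed = dss_triage(FORMS)

print(f"Flagged for manual review: {flagged:,.0f} ({flagged / FORMS:.2%} of all forms)")
print(f"Missed non-compliant forms: {missed:,.0f} ({missed / FORMS:.2%} of all forms)")
```

With these inputs, the compliance department would review about 1,840 forms instead of sampling from 100,000, while about 160 non-compliant forms would go undetected.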
Notably, since the data is so vast and the number of variables so large, for many DSS, the 92% accuracy of AI systems is sufficient to improve the outputs of the corresponding DSS. Hence, there are currently more than two thousand use cases where AI systems are being used in this way. And despite the hype about Generative AI and its use cases, AI systems that improve DSS deliver significantly better results on the KPIs mentioned in Section 1, thereby providing a higher return on investment. Finally, even though there are more than 2,000 use cases of contemporary DSS that use AI systems, such DSS usually do not displace human jobs. In fact, since they often increase revenue or decrease cost, organizations often hire additional employees to improve their systems, processes, and products.
Since the seminal paper by Frey and Osborne in 2013 [1], people have dreaded losing jobs to AI. But as discussed above, very few jobs have been lost so far, and very few will be lost during the next 5-8 years. This is because for AI systems to displace humans, they must be cheaper, better, and faster. And although they are faster than humans because they transmit signals via electrons whereas humans transmit them via ions, they are generally inferior with respect to both cost and accuracy. Hence, Sections 3, 4, and 5 provide use cases where these AI systems currently have a huge TCO (total cost of ownership) and hence are economically unviable. On the other hand, Sections 6 and 7 provide use cases where current AI systems are sufficient to provide a high RoI (return on investment). These five sections are summarized below:
For the last decade, a gigantic amount of investment and intellectual capital has been going into AI research and development. If this continues, it is quite likely that there will be a breakthrough within the next 10-15 years, which will allow the new AI systems to breach the 92% (+/-2%) accuracy barrier. In fact, it won’t be surprising if such AI systems achieve or surpass human level accuracy. And in such a case, a massive number of jobs will be displaced. Given this backdrop, Chapter 16 in [3] discusses the following colossal shifts due to AI and other factors (during 2025-2050):
Increase in labor supply by 2050: The global working population will increase from 3.2 billion in 2021 to 3.9 billion in 2050 and hence to maintain the status quo, an additional 700 million jobs need to be created.
Increase in labor demand by 2050: Demand for human labor will be governed by the following monumental factors:
In short, if human society decides to combat climate change on a war footing, it is quite likely that 1,310 million new jobs will be created by 2050 whereas 395 million jobs will be lost. Hence, the number of jobs created in the current industrial revolution may exceed those that will be lost by almost 915 million but there will be only 700 million new workers available in the market (by 2050). And this will imply more reliance on Artificial Intelligence, Robotics, automation, and other key inventions of the current industrial revolution. In summary, the above discussion implies that we as humans may be working harder in 2050 than we are currently, which is contrary to the views held in the contemporary media. Finally, since these estimates are for the next 25 years, most of these numbers should only be considered from a qualitative perspective.
Dr. Alok Aggarwal received his PhD in Electrical Engineering and Computer Science from Johns Hopkins University and worked at IBM Watson Research Center between 1984 and 2000. During 1989-90, he taught at MIT and advised two PhD students, and during 1998-2000, he founded IBM India Research Lab and grew it to 60 researchers. He co-founded Evalueserve (www.evalueserve.com) in 2000 and was its chairman until 2013; this company provides research and analytics services worldwide and has 4,500 employees. In 2014, Dr. Aggarwal founded Scry Analytics (www.scryai.com). Scry AI is a research and development company that uses AI and Data Science to help its clients solve complex and extremely laborious problems. Scry AI has developed more than 60 proprietary AI-based models and algorithms, which constitute its CognitiveBricks platform of innovative business solutions. Scry AI’s family of enterprise solutions includes: Collatio (an Intelligent Document Processing factory with unparalleled accuracy for reconciling unstructured and structured data), Auriga (for knowledge management on organizations’ internal and external data), Concentio (for providing actionable insights using Internet of Things’ data), Vigilo (for predicting operational and marketing risks), and Datatio (for extracting data lineage as data flows through disparate systems).