New Book on Artificial Intelligence Explores the Multifaceted Nature of Data

Chapter 12: Overview

In the current technological landscape, data is often hailed as the “new oil” or the “new electricity,” a cornerstone of the Fourth Industrial Revolution. However, such analogies oversimplify the multifaceted nature of data. Unlike electricity, which can only be consumed once, data remains undiminished after multiple uses. Additionally, the value of a dataset is context-dependent, challenging the notion that larger datasets always yield proportionally greater returns. These complexities are explored in depth in the twelfth chapter of Alok Aggarwal’s new book, “The Fourth Industrial Revolution & 100 Years of AI (1950-2050).” Today, we briefly discuss that chapter, “Multifaceted Nature of Data.”

Key Takeaways from the Chapter

1. Bias in data and why it is hard to eliminate: Bias in data arises from the biases of the humans who collect, annotate, harmonize, and reconcile it, and who then use it to train AI systems. Efforts to minimize these biases are ongoing, and although some progress has been made, the likelihood of eliminating biases in data is closely linked to that of eliminating biases in humans.

2. Training AI systems with biased data can be hazardous: Because AI systems are brittle (i.e., their accuracy deteriorates tremendously when even a small amount of noise is added), training these systems with biased data can yield wrong results, thereby hurting humans, especially in domains such as healthcare, product safety, robotics, criminal justice, recruiting, autonomous driving, military, and defense.

3. Other facets of data that need to be incorporated while training AI systems: These facets include (a) unclear definition of data ownership, (b) confidentiality, privacy, and security, (c) consent and purpose, and (d) auditability and lineage.

4. Dissimilar societies will handle various facets of data differently: To manage the above-mentioned quirks of data, different societies will adopt different approaches, and it will be almost impossible to come up with a universal set of rules to govern the use of such datasets. Whereas some societies are likely to tilt towards over-protection by stressing individual rights and passing overbearing regulations, others may emphasize the communal benefits of exploiting both public and private data. Of course, in this process, the first group may lose some of the potential benefits that this trove of data will generate, whereas the second may end up minimizing individual rights. Such disparate behavior among societies and countries may also lead them to handle contemporary social media sites and companies (e.g., Twitter, Facebook-Meta, Google) differently, especially because of explicit or perceived bias in the data on these websites.

5. Synthetic data may be able to overcome some of the peculiarities of real data: Two kinds of Deep Learning Networks, Generative Adversarial Networks and Diffusion Model-based Networks, are being used to generate synthetic data that avoids some of the idiosyncrasies of real data. However, this promising field is still evolving, and it may be several years before it produces high-quality synthetic data that AI systems can use for vital use cases in important domains such as healthcare or transportation.

Conclusion

Overall, the book “The Fourth Industrial Revolution & 100 Years of AI (1950-2050)” provides a concise yet comprehensive exploration of AI, covering its origins, evolutionary trajectory, and potential ubiquity during the next 27 years. Beginning with an introduction to the fundamental concepts of AI, subsequent chapters delve into its transformative journey with an in-depth analysis of the achievements of AI, with a special focus on the potential for job loss and gain. The latter portions of the book examine the limitations of AI, the pivotal role of data in enabling accurate AI systems, and the concept of “good” AI systems. It concludes by contemplating the future of AI, addressing the limitations of classical computing, and exploring alternative technologies (such as Quantum, Photonics, Graphene, and Neuromorphic computing) for ongoing advancements in the field. This book is now available at bookstores and online retailers in Kindle, paperback, and hardcover formats.

Press Contacts

Srini Bharadwaj

Scry Analytics, Inc.
+1 781-929-0669
[email protected]