Generative AI: What it is and types of applications

Generative Artificial Intelligence (GEN AI) opens up a world of possibilities when it comes to creating new and original content. This branch of AI is capable of generating text, software code, images, videos, sound, product designs, and structures. It is all based on the use of machine learning algorithms and models, such as Generative Adversarial Networks (GANs) or Transformers, which can learn from a wide range of data.

Once trained, these models can generate new content from an initial input or simply by generating random samples. This ability to generate novel content makes Generative AI a powerful technology with a multitude of practical applications. However, it brings with it certain challenges as the content generated may have significant social and cultural implications.

Therefore, in this article published in our technology section, we will take a journey through the history of Generative AI, discover different types and examples of applications that use it, and the possible risks that its use may entail.

How we arrived at Generative AI

The term Artificial Intelligence (AI) was coined in a paper entitled A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence (1955), written by John McCarthy together with a renowned group of researchers: Marvin Lee Minsky, Nathaniel Rochester and Claude Shannon. The paper formally proposed a study of artificial intelligence to be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire, on the premise that “every aspect of learning, or any other feature of intelligence, can in principle be so precisely described that a machine can be made to simulate it”. It also set out to find “how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves”.

This declaration of intent would later crystallize in McCarthy’s definition of AI as “the science and engineering of making intelligent machines, especially intelligent computer programs”.

Years later, in 1966, Joseph Weizenbaum of MIT created ELIZA, widely regarded as the first chatbot or conversational bot. This computer program simulated a psychotherapist holding a conversation with a patient.

In the 1980s and 1990s, one of the most important milestones in the development of generative AI was the emergence of probabilistic generative models, such as Bayesian Networks and Hidden Markov Models (HMMs). These allowed AI systems to make more complex decisions and generate more diverse results. However, generating high-quality content remained a challenge well into the 2000s.

It was in the 2010s that generative AI experienced a major breakthrough, with deep learning models such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs).

GANs, proposed by Ian Goodfellow and his research team in 2014, consist of two neural networks that are trained against each other in order to generate completely new content. A generator network creates synthetic samples, while a discriminator network tries to distinguish the generated samples from real ones. As the two networks compete, the generator improves its ability to create realistic and convincing content.
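The competition between generator and discriminator can be illustrated with a deliberately tiny sketch. It is not how production GANs are built. It assumes one-dimensional data, a linear generator G(z) = a*z + b, a logistic discriminator D(x) = sigmoid(w*x + c), hand-derived gradients, and the commonly used non-saturating generator loss. The function name train_toy_gan and all of its parameters are hypothetical names chosen for this example.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, batch=32, lr=0.01, real_mean=4.0, seed=0):
    """Alternate discriminator and generator updates on 1-D toy data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters: G(z) = a*z + b
    w, c = 0.1, 0.0  # discriminator parameters: D(x) = sigmoid(w*x + c)

    for _ in range(steps):
        real = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: minimize -[log D(real) + log(1 - D(fake))].
        grad_w = grad_c = 0.0
        for x_real, x_fake in zip(real, fake):
            d_real = sigmoid(w * x_real + c)
            d_fake = sigmoid(w * x_fake + c)
            grad_w += (d_real - 1.0) * x_real + d_fake * x_fake
            grad_c += (d_real - 1.0) + d_fake
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: minimize -log D(G(z)), i.e. try to fool D.
        grad_a = grad_b = 0.0
        for z in noise:
            d_fake = sigmoid(w * (a * z + b) + c)
            grad_a += (d_fake - 1.0) * w * z
            grad_b += (d_fake - 1.0) * w
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch

    return (a, b), (w, c)


(gen_a, gen_b), (disc_w, disc_c) = train_toy_gan()
print("generator offset b:", gen_b, "(real data mean is 4.0)")
```

In this setup the generator’s offset b is pushed toward the region the discriminator currently labels as real. Toy GANs like this one can oscillate rather than settle, which mirrors a well-known difficulty of training real GANs.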

VAEs are a type of neural network used in unsupervised learning. An “encoder” network takes input data, such as images or text, and compresses it into a numerical representation, usually expressed as a probability distribution rather than a single point. A sample is then drawn from that distribution, which introduces a controlled amount of uncertainty, and a second network called the “decoder” generates data from the sample. After a training phase on real examples, VAEs learn to generate new and varied images or texts that are related to the original data.
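The uncertainty introduced into the encoded representation is commonly implemented with the so-called reparameterization trick, combined with a penalty that keeps the encoded distribution close to a standard normal. The sketch below assumes the encoder has already produced a mean (mu) and a log-variance (log_var) for each latent dimension. The function names and the squared-error reconstruction term are illustrative choices, not something specified in this article.

```python
import math
import random


def sample_latent(mu, log_var, rng=random):
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def kl_to_standard_normal(mu, log_var):
    # KL divergence between N(mu, sigma^2) and N(0, 1), summed over dimensions.
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


def vae_loss(x, x_reconstructed, mu, log_var):
    # Reconstruction error (squared error, an assumption) plus the KL penalty.
    reconstruction = sum((xi - ri) ** 2 for xi, ri in zip(x, x_reconstructed))
    return reconstruction + kl_to_standard_normal(mu, log_var)


# Hypothetical encoder output for one input, with two latent dimensions.
mu, log_var = [0.5, -1.0], [-2.0, -2.0]
z = sample_latent(mu, log_var, random.Random(0))
print("sampled latent vector:", z)
```

Each call to sample_latent returns a slightly different vector for the same input, which is why a trained decoder can produce varied outputs that still resemble the training data.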

In 2017, researchers from Google Brain, Google Research and the University of Toronto, including Ashish Vaswani, Noam Shazeer, and Niki Parmar, published Attention Is All You Need. This scientific paper is a milestone in the history of AI. It introduced the Transformer neural network architecture, which revolutionized the field of Natural Language Processing (NLP) and is the basis of today’s Large Language Models (LLMs).

To clarify, language models combine Natural Language Processing (NLP) and Natural Language Generation (NLG). NLP equips computers with tools to understand and process natural language (the language humans use to communicate, whether written or spoken), while NLG covers the generation of text or even speech in natural language. LLMs are therefore used to understand and generate human language automatically, performing tasks such as translation, text generation, summarization, correction, and question answering. They use machine learning algorithms and large data sets (corpora) to learn linguistic patterns and rules.

The Transformer neural network architecture proved to be highly effective in a variety of NLP tasks, including machine translation, text generation, question answering, named entity recognition and word sense disambiguation. Unlike earlier architectures based on recurrent layers, this deep learning model relies on an attention mechanism that lets it weigh the different parts of an input, such as a text, according to their importance. It also learns context, and thus meaning, by tracking relationships in sequential data such as the words in a sentence. BERT (Bidirectional Encoder Representations from Transformers), released by Google in 2018, was one of the first language models built on Transformer networks. OpenAI’s GPT family, whose first version also appeared in 2018, and Google’s Bard are further examples.
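The attention mechanism mentioned above can be sketched in a few lines. The example below is a minimal, single-head version of scaled dot-product attention written with plain Python lists. Real Transformers add learned projection matrices, multiple heads and positional information, all omitted here. The function names and the toy vectors are assumptions made for illustration.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)  # importance of each input position
        # Weighted average of the value vectors.
        out = [
            sum(wt * v[j] for wt, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy example: one query that matches the first key more closely.
out, weights = attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
print("attention weights:", weights[0])
print("output vector:", out[0])
```

The weights for each query sum to one, so the output is a weighted average of the value vectors in which the positions most similar to the query count the most.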

Today’s technology market offers a wide and varied range of generative AI applications, which we will see below and which we will group into different categories according to their content.

Generative AI applications

1. Text: This section includes applications aimed at generating creative texts, automatic summaries of long texts, content correction and chatbots that can hold conversations with users.

ChatGPT: Developed by OpenAI, this chatbot can generate human-like content and answer complex questions. Additionally, AuraQuantic offers an integration connector with Azure OpenAI Service on its marketplace. The connector provides access to OpenAI language models with the data privacy guarantees of Microsoft Azure, and it can be adapted to carry out specific tasks and combined with the platform’s many functionalities.

Copy.ai: AI-based tool that enables content creation for e-commerce, hosted blog posts, advertisements, social networks, and websites.

Grammarly: Offers real-time suggestions while writing a text to improve grammar, punctuation, and style.

2. Images: Applications for automatic generation and editing of realistic images and even generation of original artwork in any style.

DALL·E: OpenAI is behind this AI system for creating realistic images and art from natural language descriptions.

Stable Diffusion: Designed to generate digital images from natural text.

NVIDIA Canvas: Uses AI to produce photorealistic images from simple sketches.

3. Code: These generative AI tools are used to speed up the software development process by automatically generating new code.

GitHub Copilot: Helps programmers write code faster. It works with OpenAI Codex, a pre-trained generative language model created by the company responsible for ChatGPT.

Tabnine: An add-on to several integrated development environments (IDEs) that uses AI techniques to generate real-time, useful suggestions while writing programming code.

DeepCode: Now part of the Snyk platform, it helps developers improve the quality and security of source code by providing suggestions and detecting errors.

4. Audio: For creating musical compositions and improving audio quality in recordings.

Amper Music: One of the easiest-to-use AI music generators on the market, as no knowledge of music composition is required.

Auphonic: Audio post-production web tool for professional quality results.

Murf.ai: This is one of the most popular AI speech generators. Any user can convert text to speech (TTS), voice-over and dictation, which is very useful for product developers, customer service, podcasters, marketers, and other professional profiles.

5. Video: This type of generative AI tool is aimed at generating synthetic videos using algorithms, computer rendering techniques and deepfakes.

Synthesia: AI video creation platform that promises professional results in as little as 15 minutes, without the need for special editing equipment or skills.

Runway Gen-1: Uses words and images to generate new videos from existing ones. In addition, Runway’s AI tools were used in some scenes of Everything Everywhere All at Once, the winner of seven Oscars.

Reface: Face swapping application in video clips and GIFs to create deepfakes.

6. Design: Generative AI applications focused on product design, for idea generation and for design optimization and customization.

Generative Design: Autodesk is the company responsible for this tool that uses AI algorithms for the design and manufacture of products based on established requirements.

Ansys Discovery: A simulation-driven design tool that reveals critical information early in the design process (prototyping).

nTop: Formerly known as nTopology, this is a CAD tool that has revolutionized the way engineers design parts for additive manufacturing in the aerospace, medical, construction, automotive and consumer industries.

Risks of Generative AI

The final section is dedicated to identifying the main risks of generative AI. This is an issue that is currently generating controversy among numerous figures and social groups, such as the scientific community, regulatory bodies and legislators, companies, advocacy groups and activists, as well as users and the general public.

This is reflected in an article entitled “The dark side of generative artificial intelligence: A critical analysis of controversies and risks of ChatGPT” (2023), published in the quarterly scientific journal Entrepreneurial Business and Economics Review (EBER), funded by the Krakow University of Economics. The text, signed by a large team of researchers from different academic institutions, identifies the main risks associated with the use of generative models, which include:

No regulation of the AI market and urgent need to establish a regulatory framework.
Poor quality of information, lack of quality control, disinformation, deepfake content, algorithmic bias.
Automation-spurred job losses.
Personal data violation, social surveillance and privacy violation.
Social manipulation, weakening ethics and goodwill.
Widening socio-economic inequalities.
AI-related technostress.

Given these threats, some governments and institutions have decided to take action. By the end of 2023, the EU is expected to reach an agreement on the form and structure of its Artificial Intelligence Act. It would be the first comprehensive law on AI in the world, and it aims to ensure favorable conditions for the development and application of this technology in different fields. The challenge is to ensure that AI systems used in the EU are safe, transparent, traceable, non-discriminatory, and environmentally friendly.