Transformer (machine learning model)

From Simple English Wikipedia, the free encyclopedia

A transformer is a computer model used for deep learning, which is a kind of machine learning that uses neural networks with many layers. Transformers were introduced in the 2017 paper "Attention Is All You Need" by researchers at Google.[1] Transformers are widely used to train large language models. Text is first tokenized, which means it is split into small pieces called tokens, and each token is changed into numbers that the model can work with.[2] A transformer can process all parts of an input sequence at the same time.[3] Older sequential models, such as recurrent neural networks, process data one step at a time, which makes them slower to train.[4] Transformers are used with many kinds of data, including text, images, and audio. They are the basis of models like GPT, which powers the chatbot ChatGPT.
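The two ideas above, changing text into numbers and looking at every part of the input at once, can be shown with a small Python sketch. It is a toy example, not how any real transformer is built. The word-level tokenizer, the function names, and the tiny hand-written vectors are all made up for this example. Real models use learned subword tokenizers and learned weights, which are left out here.

```python
import math

def build_vocab(texts):
    """Give every distinct lowercase word its own integer ID (toy word-level vocabulary)."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab

def tokenize(text, vocab):
    """Change a sentence into a list of integer IDs."""
    return [vocab[word] for word in text.lower().split()]

def softmax(scores):
    """Turn a list of scores into positive weights that add up to 1."""
    biggest = max(scores)
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Simplified self-attention: each position becomes a weighted mix of all positions.

    Assumption: the learned query, key, and value projections of a real
    transformer are left out, so the input vectors are used directly.
    """
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Compare this position with every position in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Mix all vectors using the attention weights.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        outputs.append(mixed)
    return outputs

vocab = build_vocab(["the cat sat on the mat"])
token_ids = tokenize("the cat sat", vocab)
print(token_ids)  # [0, 1, 2]

# Hypothetical 2-number vectors, one per token (real models learn these).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs = self_attention(embeddings)
print(outputs)
```

The loop in `self_attention` runs one position after another only because this is plain Python. Each output depends only on the input vectors and not on any earlier output, so all positions could be computed at the same time. A sequential model cannot do this, because each of its steps needs the result of the step before it.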

References

