Transformers in Machine Learning: Literature Review

Abstract: In this study, the researchers present a review of methods in transformer-based machine learning. A transformer is a deep learning neural network architecture that takes sequences as input and learns contextual relationships between words, and it can be modified for many tasks. Transformers are widely used in studies with a variety of objects: they have been applied to text compression in readings, to recognizing chemical images with an accuracy rate of 96%, and to detecting a person's emotions, for example classifying social media conversations on Facebook into the categories happy, sad, and angry. Figure 1 illustrates how the encoder and decoder turn an input sequence into an output sequence. The purpose of this study is to review literature from various journals that discuss transformers, presenting for each study the subject or dataset, the data analysis method, the year, and the accuracy achieved. Using this approach, the researchers identify the highest accuracies reported and opportunities for further research.


Introduction
A transformer is a neural network architecture that takes sequences as input (Reinauer et al., 2021; Grechishnikova, 2021; Ramos-Pérez et al., 2021; Moutik et al., 2023; Röder, 2023; Lin et al., 2022; Abed et al., 2023). In principle, the transformer is an artificial neural network architecture designed to solve sequential problems such as sentences (Luitse et al., 2021; Taye, 2023; Yang & Wang, 2020). The transformer connects an encoder and a decoder over the text in stages (Singla et al., 2020; Alqudsi et al., 2019; Liu & Chen, 2022); input and output are interconnected via context vectors, and the autoencoder converts input into output. Besides natural language processing, transformers are also used in computer vision (Ghojogh et al., 2020; He et al., 2023). The transformer is a concept in Natural Language Processing (NLP) within deep learning (Singla et al., 2020).
Transformers are also part of an open-source NLP library (Singla et al., 2020). Natural language processing handles many words that are processed through transformers (Singla et al., 2020; Khurana et al., 2023; Caucheteux & King, 2022). Several studies apply transformer-based machine learning in the health sector (Alqudsi et al., 2019; Arshed et al., 2023; Sanmarchi et al., 2023). Transformers have been used to examine people's political views on social media, applying BERT, LSTM, Support Vector Machine, Decision Trees, Naïve Bayes, and Electra; in that study, the Electra model achieved the highest accuracy, with a value of 70% (Öztürk et al., 2022). The TurnGPT transformer model is used in spoken dialogue (Ekstedt et al., 2020). Other research applies transformers to musical chord recognition using the bi-directional Transformer for chord recognition (BTC) method (Park et al., 2019), to block models (Yan Xiao A et al., 2020), and to 3D imaging on the plane (Dong Yang et al., 2021).
Transformers are used for text compression in readings (Li et al., 2021) and to recognize chemical images with an accuracy rate of 96% (Rajan et al., 2021). They are also used to detect a person's emotions (Zhong et al., 2019; Graterol et al., 2021; Ghosh et al., 2023), for example in social media conversations on Facebook with the categories happy, sad, and angry (Zhong et al., 2019; Acheampong et al., 2020; Li et al., 2020). One machine learning study used a model with more than 1000 transformers to analyze health problems (Wu et al., 2020). Transformers have also been used to predict influenza cases over the next 10 weeks (Rendragraha et al., 2021; Santangelo et al., 2023; Piccialli et al., 2021).

Theory
The transformer model has been used to detect abusive language in Indonesian online news comments, classifying text as offensive, normal, or non-offensive (Tran et al., 2020). It has also been used for energy monitoring with machine learning algorithms in the health sector (Valencia et al., 2021), to analyze a person's stress (Vaswani et al., 2017), and for the classification of music in Indonesia (Thoyyibah et al., 2022). Transformers are mechanisms that learn contextual relationships between words. Figure 1 illustrates how the encoder and decoder turn an input sequence into an output sequence. The transformer contains two mechanisms, namely:

Encoder
The encoder reads the entire text input at once. It consists of a stack of identical layers, each with two sub-layers: a self-attention layer and a feed-forward neural network. The self-attention layer helps the encoder focus not only on the word it is currently processing but also on the semantic context of that word. Each position in the encoder can attend to all positions in the previous layer of the encoder.
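The self-attention step described above can be illustrated with a minimal single-head sketch in Python with NumPy. The weight matrices `Wq`, `Wk`, `Wv`, the toy dimensions, and the random input are illustrative assumptions, not taken from any of the reviewed papers; real encoders add multiple heads, layer normalization, and the feed-forward sub-layer.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Project the input into queries, keys, and values
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Scaled dot-product scores: how much each position attends to every position
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    # Output: one context-aware vector per input position
    return weights @ V

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                 # toy "sentence" of 4 token embeddings
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Because the attention weights span the whole sequence, every output vector mixes information from all four positions, which is what lets the encoder capture a word's semantic context.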

Decoder
The decoder generates the predicted output sequence. Like the encoder, it consists of a stack of identical layers. Each layer contains the same two sub-layers as the encoder, plus an additional attention layer between them that performs multi-head attention over the encoder output, helping the current position reach the content it needs to attend to. Unlike in the encoder, the self-attention layer in the decoder lets each position attend only to previous and current positions. From the various journals described earlier, the researchers became interested in studying transformers. This research differs from previous work in several aspects. First, it covers articles published from 2017 to 2022, almost all of which appeared in international journals. Second, it reviews these articles with critical thinking skills as the main focus of discussion. Third, several parameters form the basis of the content analysis, including the method used, the accuracy, and other aspects.
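The causal restriction described above, where each decoder position attends only to itself and to earlier positions, can be sketched by masking the attention scores before the softmax. As before, the single head, toy sizes, and random weights are illustrative assumptions; a full decoder layer also includes the cross-attention over the encoder output and a feed-forward sub-layer.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def masked_self_attention(X, Wq, Wk, Wv):
    """Causal self-attention: position i sees only positions 0..i."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Entries strictly above the diagonal correspond to future positions
    future = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(future, -1e9, scores)  # block attention to future tokens
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(1)
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out, weights = masked_self_attention(X, Wq, Wk, Wv)
print(weights[0])  # first position attends only to itself
```

The first row of `weights` places all its mass on position 0, and row `i` is nonzero only up to column `i`, matching the decoder behavior described in the text.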

Research Design
This research analyzes the contents of previous literature reviews, focusing on the findings of various studies that have been published in international journals.

Data Source
The data analyzed in this literature review come from international journals about transformers in machine learning, taken from various sources. In total, 23 journal articles examining transformers in machine learning, all published online, were collected. All of them are analyzed in this study to determine which transformer methods are used in machine learning.

Research Instruments
This research is a literature review of several research journals related to transformers in machine learning. The review covers some of the most recent research efforts utilizing machine learning. The study draws on several sources and includes problem-solving efforts, divided by area from the perspective of each machine learning category, using the various methods available in the research journals. The data collection process examined the literature to find and obtain study sources based on relevant previous research, along with supporting theories, data, and information as references for the documentation.
Table 1 presents the main aspects of this literature review of several international journals. These aspects are (A) type of research, (B) year, (C) dataset, (D) transformer model, (E) accuracy, and (F) field of research.

Result and Discussion
The number of article publications shows how high the accuracy achieved by research in a given period is. Referring to the graph shown in the table above, which reviews the accuracy level of each transformer method, the average accuracy values reach as high as 90% and as low as 10%. In addition, the publications each year use different methods. This trend shows that transformers can be implemented in various fields, not only in one.

Type of Research
Some of the reviewed studies are experimental, and almost all of these use a public or private database. Other researchers are more interested in studying transformers theoretically, so those authors use literature reviews of several studies from various sources. Figure 1 is a graph of the journal analysis of the transformer. Figure 2 distinguishes the two types of research, namely literature review and experiment: of the journals studied, 78% are experimental and 22% are literature reviews.

Research Year
The distribution of the reviewed studies by year is shown in Figure 3: 2020 with 11 studies, 2021 with 6, 2019 with 3, 2017 with 1, and 2022 with 1. Of these years, 2018 had no research, so this gap is an opportunity for transformer research.

Datasets
As shown in Figure 4, this study covers both public and private datasets. Public datasets are easy to find, while private datasets are hard to find and require certain approvals. Private datasets are usually used in the health sector.

Research Methods
The reviewed studies use several methods: the transformer itself, BERT, and GPT, where BERT and GPT are variants of the transformer. As Figure 5 shows, the transformer method dominates: of the 23 papers reviewed, 16 used transformers, 6 used BERT, and 1 used GPT. This imbalance is an opportunity to explore the BERT and GPT methods further.

Level of Accuracy
The accuracy rates in Figure 6 average above 40%. Some of the studies analyzed did not report their accuracy, describing only the data collection process or a comparison of methods. In total, 10 of the 23 journals analyzed reported the accuracy achieved; one journal reported 54% [20], and one reached 99.1% [1].

Figure 3. Year of research

Figure 6. Accuracy level

Field of Data Analysis

Figure 7. Field of analysis

Table 1. Aspects and Categories Used for Content Analysis